A ChIP-Seq Data Analysis Pipeline Based on Bioconductor Packages

Park, Seung-Jin;Kim, Jong-Hwan;Yoon, Byung-Ha;Kim, Seon-Young;

doi:10.5808/GI.2017.15.1.11

Genomics & Informatics

Volume 15, Issue 1, Pages 11-18, 2017; pISSN 1598-866X; eISSN 2234-0742

Korea Genome Organization (한국유전체학회)

Park, Seung-Jin (Personalized Genomic Medicine Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB)) ;
Kim, Jong-Hwan (Personalized Genomic Medicine Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB)) ;
Yoon, Byung-Ha (Personalized Genomic Medicine Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB)) ;
Kim, Seon-Young (Personalized Genomic Medicine Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB))

Received : 2017.01.31
Accepted : 2017.03.06
Published : 2017.03.31

https://doi.org/10.5808/GI.2017.15.1.11

Abstract

Large volumes of chromatin immunoprecipitation-sequencing (ChIP-Seq) data are now being generated to improve our understanding of DNA-protein interactions in the cell, and many tools have accordingly been developed for ChIP-Seq analysis. Here, we provide an example of a streamlined workflow for ChIP-Seq data analysis composed of only four Bioconductor packages: dada2, QuasR, mosaics, and ChIPseeker. 'dada2' trims the high-throughput sequencing reads. 'QuasR' performs quality control and maps the input reads to the reference genome, and 'mosaics' calls peaks. Finally, 'ChIPseeker' annotates and visualizes the called peaks. The workflow runs on any major operating system (e.g., Windows, Mac, or Linux) and processes the input fastq files into the various results in a single run. The R code is available on GitHub: https://github.com/ddhb/Workflow_of_Chipseq.git.
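The four steps named in the abstract might be chained roughly as in the sketch below. This is not the authors' script (that is in the GitHub repository). It is a minimal illustration under assumptions: single-end reads, one ChIP sample with one matched input control, a human hg19 reference (`BSgenome.Hsapiens.UCSC.hg19` and `TxDb.Hsapiens.UCSC.hg19.knownGene`), and illustrative parameter values. The function name `run_chipseq_workflow`, its arguments, and the output file names are hypothetical.

```r
# Sketch only: genome, file names, and all parameter values are assumptions,
# not settings taken from the paper.
run_chipseq_workflow <- function(chip_fastq, input_fastq,
                                 genome = "BSgenome.Hsapiens.UCSC.hg19",
                                 out_dir = "chipseq_out") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  # 1. dada2: quality-trim and filter the raw reads.
  raw <- c(chip = chip_fastq, input = input_fastq)
  trimmed <- file.path(out_dir, paste0(names(raw), "_trimmed.fastq.gz"))
  for (i in seq_along(raw)) {
    dada2::fastqFilter(raw[[i]], trimmed[i], truncQ = 2, minLen = 20, maxN = 0)
  }

  # 2. QuasR: align to the reference genome and write a QC report.
  #    qAlign reads a tab-delimited sample file with FileName/SampleName columns.
  sample_file <- file.path(out_dir, "samples.txt")
  write.table(data.frame(FileName = normalizePath(trimmed),
                         SampleName = names(raw)),
              sample_file, sep = "\t", quote = FALSE, row.names = FALSE)
  proj <- QuasR::qAlign(sample_file, genome = genome, alignmentsDir = out_dir)
  QuasR::qQCReport(proj, pdfFilename = file.path(out_dir, "qc_report.pdf"))
  bam <- QuasR::alignments(proj)$genome$FileName  # same order as sample file

  # 3. mosaics: bin the alignments, fit the model, and call peaks.
  for (b in bam) {
    mosaics::constructBins(infile = b, fileFormat = "bam",
                           outfileLoc = out_dir, fragLen = 200, binSize = 200)
  }
  # Assumes constructBins' "<bam name>_fragL<fragLen>_bin<binSize>.txt" naming.
  bin_files <- file.path(out_dir,
                         paste0(basename(bam), "_fragL200_bin200.txt"))
  bins <- mosaics::readBins(type = c("chip", "input"), fileName = bin_files)
  fit <- mosaics::mosaicsFit(bins, analysisType = "IO", bgEst = "rMOM")
  peaks <- mosaics::mosaicsPeak(fit, signalModel = "2S", FDR = 0.05)
  bed <- file.path(out_dir, "peaks.bed")
  mosaics::export(peaks, type = "bed", filename = bed)

  # 4. ChIPseeker: annotate the peaks and plot the annotation summary.
  #    Assumes the exported BED file parses cleanly with readPeakFile.
  txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene
  anno <- ChIPseeker::annotatePeak(ChIPseeker::readPeakFile(bed),
                                   tssRegion = c(-3000, 3000), TxDb = txdb)
  ChIPseeker::plotAnnoPie(anno)
  invisible(anno)
}
```

A call such as `run_chipseq_workflow("chip.fastq.gz", "input.fastq.gz")` (hypothetical file names) would then carry the fastq files through trimming, alignment, peak calling, and annotation in one run. Because every package is referenced with `::`, defining the function does not require the packages to be installed; they are needed only when it is called.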
