Bioinformatics services for analyzing massive genomic datasets
Ko, Gunhwan (Korea Bioinformation Center (KOBIC), KRIBB)
Kim, Pan-Gyu (Korea Bioinformation Center (KOBIC), KRIBB)
Cho, Youngbum (Genome Editing Research Center, KRIBB)
Jeong, Seongmun (Genome Editing Research Center, KRIBB)
Kim, Jae-Yoon (Genome Editing Research Center, KRIBB)
Kim, Kyoung Hyoun (Genome Editing Research Center, KRIBB)
Lee, Ho-Yeon (Genome Editing Research Center, KRIBB)
Han, Jiyeon (Department of BioInformation Science, Ewha Womans University)
Yu, Namhee (Department of BioInformation Science, Ewha Womans University)
Ham, Seokjin (Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH))
Jang, Insoon (Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH))
Kang, Byunghee (Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH))
Shin, Sunguk (Department of Systems Biology, Division of Life Sciences, and Institute for Life Science and Biotechnology, Yonsei University)
Kim, Lian (Bioposh Inc.)
Lee, Seung-Won (SeqGenesis)
Nam, Dougu (School of Life Sciences, Ulsan National Institute of Science and Technology)
Kim, Jihyun F. (Department of Systems Biology, Division of Life Sciences, and Institute for Life Science and Biotechnology, Yonsei University)
Kim, Namshin (Genome Editing Research Center, KRIBB)
Kim, Seon-Young (Genome Structure Research Center, KRIBB)
Lee, Sanghyuk (Department of BioInformation Science, Ewha Womans University)
Roh, Tae-Young (Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH))
Lee, Byungwook (Korea Bioinformation Center (KOBIC), KRIBB)