Deep Learning in Genomic and Medical Image Data Analysis: Challenges and Approaches
Yu, Ning (Dept. of Informatics, University of South Carolina Upstate); Yu, Zeng (School of Information Science and Technology, Southwest Jiaotong University); Gu, Feng (Dept. of Computer Science, College of Staten Island); Li, Tianrui (School of Information Science and Technology, Southwest Jiaotong University); Tian, Xinmin (Intel Compilers and Languages, SSG, Intel Corporation); Pan, Yi (Dept. of Computer Science, Georgia State University)
1 | X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), Sardinia, Italy, 2010, pp. 249-256. |
2 | K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 2014, pp. 1724-1734. |
3 | Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, et al., "Google's neural machine translation system: bridging the gap between human and machine translation," 2016 [Online]. Available: https://arxiv.org/pdf/1609.08144.pdf. |
4 | N. Yu, X. Guo, F. Gu, and Y. Pan, "Signalign: an ontology of DNA as signal for comparative gene structure prediction using information-coding-and-processing techniques," IEEE Transactions on Nanobioscience, vol. 15, no. 2, pp. 119-130, 2016. DOI |
5 | N. Yu, X. Guo, F. Gu, and Y. Pan, "DNA AS X: an information-coding-based model to improve the sensitivity in comparative gene analysis," in Proceedings of the International Symposium on Bioinformatics Research and Applications (ISBRA), Norfolk, VA, 2015, pp. 366-377. |
6 | I. Cosic, "Macromolecular bioactivity: is it resonant interaction between macromolecules? Theory and applications," IEEE Transactions on Biomedical Engineering, vol. 41, no. 12, pp. 1101-1114, 1994. DOI |
7 | S. B. Arniker, H. K. Kwan, N. F. Law, and D. P. K. Lun, "DNA numerical representation and neural network based human promoter prediction system," in Proceedings of 2011 Annual IEEE India Conference (INDICON), Hyderabad, India, 2011, pp. 1-4. |
8 | G. Kauer and H. Blocker, "Applying signal theory to the analysis of biomolecules," Bioinformatics, vol. 19, no. 16, pp. 2016-2021, 2003. DOI |
9 | K. Jabbari and G. Bernardi, "Cytosine methylation and CpG, TpG (CpA) and TpA frequencies," Gene, vol. 333, pp. 143-149, 2004. DOI |
10 | G. L. Rosen, "Examining coding structure and redundancy in DNA," IEEE Engineering in Medicine and Biology Magazine, vol. 25, no. 1, pp. 62-68, 2006. DOI |
11 | K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, 2016, pp. 770-778. |
12 | D. Mishkin and J. Matas, "All you need is a good init," in Proceedings of the International Conference on Learning Representations (ICLR), San Juan, Puerto Rico, 2016, pp. 1-13. |
13 | I. Sutskever, J. Martens, G. Dahl, and G. Hinton, "On the importance of initialization and momentum in deep learning," in Proceedings of the International Conference on Machine Learning (ICML), Atlanta, GA, 2013, pp. 1139-1147. |
14 | H. Larochelle, Y. Bengio, J. Louradour, and P. Lamblin, "Exploring strategies for training deep neural networks," Journal of Machine Learning Research, vol. 10, pp. 1-40, 2009. |
15 | R. Pascanu, T. Mikolov, and Y. Bengio, "On the difficulty of training recurrent neural networks," in Proceedings of the International Conference on Machine Learning (ICML), Atlanta, GA, 2013, pp. 1310-1318. |
16 | M. Bianchini and F. Scarselli, "On the complexity of shallow and deep neural network classifiers," in Proceedings of the 22nd European Symposium on Artificial Neural Networks (ESANN), Bruges, Belgium, 2014, pp. 371-376. |
17 | J. Duchi, E. Hazan, and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," Journal of Machine Learning Research, vol. 12, pp. 2121-2159, 2011. |
18 | C. Szegedy, S. Ioffe, and V. Vanhoucke, "Inception-v4, inception-ResNet and the impact of residual connections on learning," 2016 [Online]. Available: https://arxiv.org/pdf/1602.07261.pdf. |
19 | Y. Bengio, "Practical recommendations for gradient-based training of deep architectures," in Neural Networks: Tricks of the Trade. Heidelberg: Springer, 2012, pp. 437-478. |
20 | J. Bergstra, R. Bardenet, Y. Bengio, and B. Kegl, "Algorithms for hyper-parameter optimization," Advances in Neural Information Processing Systems, vol. 24, pp. 2546-2554, 2011. |
21 | D. Kingma and J. Ba, "Adam: a method for stochastic optimization," 2014 [Online]. Available: https://arxiv.org/pdf/1412.6980.pdf. |
22 | J. F. G. de Freitas, "Bayesian methods for neural networks," Ph.D. dissertation, University of Cambridge, UK, 2000. |
23 | J. Snoek, H. Larochelle, and R. P. Adams, "Practical Bayesian optimization of machine learning algorithms," Advances in Neural Information Processing Systems, vol. 25, pp. 2951-2959, 2012. |
24 | F. Hutter, H. H. Hoos, and K. Leyton-Brown, "Sequential model-based optimization for general algorithm configuration," in Learning and Intelligent Optimization. Heidelberg: Springer, 2011, pp. 507-523. |
25 | J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization," Journal of Machine Learning Research, vol. 13, pp. 281-305, 2012. |
26 | L. Deng, M. Seltzer, D. Yu, A. Acero, A. Mohamed, and G. E. Hinton, "Binary coding of speech spectrograms using a deep auto-encoder," in Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), Makuhari, Japan, 2010, pp. 1692-1695. |
27 | A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, pp. 1097-1105, 2012. |
28 | J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, Q. V. Le, and A. Y. Ng, "On optimization methods for deep learning," in Proceedings of the 28th International Conference on Machine Learning (ICML), Bellevue, WA, 2011, pp. 265-272. |
29 | J. Martens, "Deep learning via Hessian-free optimization," in Proceedings of the 27th International Conference on Machine Learning (ICML), Haifa, Israel, 2010, pp. 735-742. |
30 | R. Raina, A. Madhavan, and A. Y. Ng, "Large-scale deep unsupervised learning using graphics processors," in Proceedings of the 26th Annual International Conference on Machine Learning (ICML), Montreal, Canada, 2009, pp. 873-880. |
31 | J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, et al., "Large scale distributed deep networks," Advances in Neural Information Processing Systems, vol. 25, pp. 1223-1231, 2012. |
32 | Q. Ho, J. Cipar, H. Cui, S. Lee, J. K. Kim, P. B. Gibbons, G. A. Gibson, G. Ganger, and E. P. Xing, "More effective distributed ML via a stale synchronous parallel parameter server," Advances in Neural Information Processing Systems, vol. 26, pp. 1223-1231, 2013. |
33 | Y. Bengio, H. Schwenk, J. S. Senecal, F. Morin, and J. L. Gauvain, "Neural probabilistic language models," in Innovations in Machine Learning. Heidelberg: Springer, 2006, pp. 137-186. |
34 | M. Li, D. G. Andersen, J. W. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B. Y. Su, "Scaling distributed machine learning with the parameter server," in Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Broomfield, CO, 2014, pp. 583-598. |
35 | C. Angermueller, T. Parnamaa, L. Parts, and O. Stegle, "Deep learning for computational biology," Molecular Systems Biology, vol. 12, article no. 878, pp. 1-16, 2016. |
36 | Y. Bengio, A. Courville, and P. Vincent, "Representation learning: a review and new perspectives," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798-1828, 2013. DOI |
37 | R. Socher, J. Pennington, E. Huang, A. Ng, and C. Manning, "Semi-supervised recursive autoencoders for predicting sentiment distributions," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Edinburgh, UK, 2011, pp. 151-161. |
38 | Y. Bengio, "Learning deep architectures for AI," Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1-127, 2009. DOI |
39 | Y. Bengio, "Deep learning of representations: looking forward," in Proceedings of the International Conference on Statistical Language and Speech Processing (SLSP), Tarragona, Spain, 2013, pp. 1-37. |
40 | Y. LeCun, Y. Bengio, and G. E. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015. DOI |
41 | S. Min, B. Lee, and S. Yoon, "Deep learning in bioinformatics," Briefings in Bioinformatics, 2016. DOI: 10.1093/bib/bbw068. |
42 | G. Hinton, P. Dayan, B. Frey, and R. Neal, "The wake-sleep algorithm for unsupervised neural networks," Science, vol. 268, no. 5214, pp. 1158-1161, 1995. DOI |
43 | G. E. Hinton, "Learning multiple layers of representation," Trends in Cognitive Sciences, vol. 11, no. 10, pp. 428-434, 2007. DOI |
44 | M. K. Leung, H. Y. Xiong, L. J. Lee, and B. J. Frey, "Deep learning of the tissue-regulated splicing code," Bioinformatics, vol. 30, no. 12, pp. i121-i129, 2014. DOI |
45 | H. Kim, J. Park, J. Jang, and S. Yoon, "DeepSpark: spark-based deep learning supporting asynchronous updates and Caffe compatibility," 2016 [Online]. Available: https://arxiv.org/pdf/1602.08191.pdf. |
46 | M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, et al., "TensorFlow: large-scale machine learning on heterogeneous distributed systems," 2016 [Online]. Available: https://arxiv.org/pdf/1603.04467.pdf. |
47 | L. Deng, G. Hinton, and B. Kingsbury, "New types of deep neural network learning for speech recognition and related applications: an overview," in Proceedings of 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, BC, 2013, pp. 8599-8603. |
48 | P. Di Lena, K. Nagata, and P. Baldi, "Deep architectures for protein contact map prediction," Bioinformatics, vol. 28, no. 19, pp. 2449-2457, 2012. DOI |
49 | J. Eickholt and J. Cheng, "Predicting protein residue-residue contacts using deep networks and boosting," Bioinformatics, vol. 28, no. 23, pp. 3066-3072, 2012. DOI |
50 | C. Trapnell, L. Pachter, and S. L. Salzberg, "TopHat: discovering splice junctions with RNA-Seq," Bioinformatics, vol. 25, no. 9, pp. 1105-1111, 2009. DOI |
51 | M. K. K. Leung, A. Delong, B. Alipanahi, and B. J. Frey, "Machine learning in genomic medicine: a review of computational problems and data sets," Proceedings of the IEEE, vol. 104, no. 1, pp. 176-197, 2016. DOI |
52 | M. W. Libbrecht and W. S. Noble, "Machine learning applications in genetics and genomics," Nature Reviews Genetics, vol. 16, no. 6, pp. 321-332, 2015. DOI |
53 | N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014. |
54 | S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France, 2015, pp. 448-456. |