1 |
Vogl, Richard, Matthias Dorfer, and Peter Knees. "Recurrent neural networks for drum transcription." Proceedings of the International Society for Music Information Retrieval Conference (ISMIR). (2016).
|
2 |
Choi, Keunwoo, George Fazekas, and Mark Sandler. "Automatic tagging using deep convolutional neural networks." Proceedings of the International Society for Music Information Retrieval Conference (ISMIR). (2016).
|
3 |
Schlüter, Jan, Karen Ullrich, and Thomas Grill. "Structural segmentation with convolutional neural networks MIREX submission." 10th running of the Music Information Retrieval Evaluation eXchange (MIREX 2014). (2014).
|
4 |
Schlüter, Jan. "Learning to pinpoint singing voice from weakly labeled examples." Proceedings of the International Society for Music Information Retrieval Conference (ISMIR). (2016).
|
5 |
Leimeister, Matthias. "Feature learning for classifying drum components from nonnegative matrix factorization." Audio Engineering Society Convention 138. Audio Engineering Society, (2015).
|
6 |
Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.
|
7 |
Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18.7 (2006): 1527-1554.
|
8 |
Bengio, Yoshua, et al. "Greedy layer-wise training of deep networks." Advances in neural information processing systems 19 (2007): 153.
|
9 |
Hinton, Geoffrey, et al. "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups." IEEE Signal Processing Magazine 29.6 (2012): 82-97.
|
10 |
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in neural information processing systems. (2012).
|
11 |
Donahue, Jeffrey, et al. "Long-term recurrent convolutional networks for visual recognition and description." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (2015).
|
12 |
Dahl, George E., et al. "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition." IEEE Transactions on Audio, Speech, and Language Processing 20.1 (2012): 30-42.
|
13 |
Understanding LSTM Networks, http://colah.github.io/posts/2015-08-Understanding-LSTMs/
|
14 |
The Unreasonable Effectiveness of Recurrent Neural Networks, http://karpathy.github.io/2015/05/21/rnn-effectiveness/
|
15 |
Schuster, Mike, and Kuldip K. Paliwal. "Bidirectional recurrent neural networks." IEEE Transactions on Signal Processing 45.11 (1997): 2673-2681.
|
16 |
Amodei, Dario, et al. "Deep speech 2: End-to-end speech recognition in English and Mandarin." arXiv preprint arXiv:1512.02595 (2015).
|
17 |
Mikolov, Tomas, et al. "Strategies for training large scale neural network language models." Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on. IEEE, (2011).
|
18 |
Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. "Speech recognition with deep recurrent neural networks." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, (2013).
|
19 |
Hannun, Awni, et al. "Deep speech: Scaling up end-to-end speech recognition." arXiv preprint arXiv:1412.5567 (2014).
|
20 |
Black, Alan W., Heiga Zen, and Keiichi Tokuda. "Statistical parametric speech synthesis." 2007 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, (2007).
|
21 |
Speech Synthesis, http://slideplayer.com/slide/3148265/
|
22 |
http://www.slideshare.net/danilosoba1/statistical-parametric-speech-synthesis-heiga-zen
|
23 |
Zen, Heiga, Andrew Senior, and Mike Schuster. "Statistical parametric speech synthesis using deep neural networks." 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, (2013).
|
24 |
van den Oord, Aaron, et al. "WaveNet: A generative model for raw audio." arXiv preprint arXiv:1609.03499 (2016).
|
25 |
Mehri, Soroush, et al. "SampleRNN: An Unconditional End-to-End Neural Audio Generation Model." https://openreview.net/forum?id=SkxKPDv5, under review at ICLR 2017.
|
26 |
Southall, Carl, Ryan Stables, and Jason Hockman. "Automatic drum transcription using bi-directional recurrent neural networks." Proceedings of the International Society for Music Information Retrieval Conference (ISMIR). (2016).
|