http://dx.doi.org/10.9717/kmms.2022.25.5.748

Humming: Image Based Automatic Music Composition Using DeepJ Architecture  

Kim, Taehun (School of Global Media, College of IT, Soongsil University)
Jung, Keechul (School of Global Media, College of IT, Soongsil University)
Lee, Insung (Dept. of English, College of Humanities, Soongsil University)
Abstract
Thanks to the match between AlphaGo and Lee Sedol, machine learning has received worldwide attention and huge investments. The performance improvement of computing devices has greatly contributed to big-data processing and the development of neural networks. Artificial intelligence not only imitates human beings in many fields but in some cases even appears to surpass human capabilities. Although human creations are still generally regarded as superior, several artificial intelligence systems continue to challenge human creativity, and some of their creative outcomes are as good as those produced by human beings. Sometimes the two are indistinguishable, because a neural network can learn the common features contained in big data and reproduce them. To examine whether artificial intelligence can express the inherent characteristics of different arts, this paper proposes a new neural network model called Humming. It is an experimental model that combines VGG16, which extracts image features, with the architecture of DeepJ, which excels at generating music in various genres. Experiments on a dataset we constructed showed meaningful and valid results. Different results, however, were produced when the amount of data was increased: the network generated similar musical patterns even for images of different classes, which was not what we were aiming for. Nevertheless, this new attempt has clear significance as a starting point for feature transfer, which will be studied further.
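
The abstract describes the combination of VGG16 and DeepJ only at a high level. Below is a minimal PyTorch sketch of one plausible reading: a frozen VGG16 encoder produces an image embedding that plays the role of DeepJ's style vector, conditioning an LSTM note generator. The class name HummingSketch and all dimensions (style_dim, note_dim, hidden_dim) are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn
from torchvision import models

class HummingSketch(nn.Module):
    """Hypothetical image-conditioned music generator: VGG16 features
    stand in for DeepJ's style embedding (sizes are illustrative)."""
    def __init__(self, style_dim=64, note_dim=128, hidden_dim=256):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        self.encoder = vgg.features          # convolutional feature extractor
        for p in self.encoder.parameters():  # keep ImageNet weights frozen
            p.requires_grad = False
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.to_style = nn.Linear(512, style_dim)  # image -> style embedding
        self.lstm = nn.LSTM(note_dim + style_dim, hidden_dim,
                            num_layers=2, batch_first=True)
        self.to_notes = nn.Linear(hidden_dim, note_dim)  # per-step note logits

    def forward(self, image, note_seq):
        # image: (B, 3, 224, 224); note_seq: (B, T, note_dim) piano roll
        feat = self.pool(self.encoder(image)).flatten(1)   # (B, 512)
        style = self.to_style(feat)                        # (B, style_dim)
        # broadcast the style vector across every time step
        style = style.unsqueeze(1).expand(-1, note_seq.size(1), -1)
        out, _ = self.lstm(torch.cat([note_seq, style], dim=-1))
        return self.to_notes(out)                          # next-note logits

Usage under the same assumptions: model = HummingSketch(); logits = model(torch.randn(1, 3, 224, 224), torch.zeros(1, 32, 128)) yields (1, 32, 128) next-step note predictions, so changing the input image should steer the generated sequence.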
Keywords
Artificial Intelligence; Machine Learning; Neural Network; Music Composition;
References
1 S. Dieleman, A. van den Oord, and K. Simonyan, "The Challenge of Realistic Music Generation: Modelling Raw Audio at Scale," arXiv preprint, arXiv:1806.10474, 2018.
2 H.H. Mao, T. Shin, and G. Cottrell, "DeepJ: Style-Specific Music Generation," IEEE 12th International Conference on Semantic Computing (ICSC), pp. 377-382, 2018.
3 J.P. Briot, G. Hadjeres, and F.D. Pachet, "Deep Learning Techniques for Music Generation--A Survey," arXiv preprint, arXiv:1709.01620, 2017.
4 S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," Neural Computation, Vol. 9, No. 8, pp. 1735-1780, 1997.
5 D. Makris, M. Kaliakatsos-Papakostas, I. Karydis, and K.L. Kermanidis, "Combining LSTM and Feed Forward Neural Networks for Conditional Rhythm Composition," International Conference on Engineering Applications of Neural Networks, Springer, Cham, pp. 570-582, 2017.
6 D.D. Johnson, "Generating Polyphonic Music using Tied Parallel Networks," International Conference on Evolutionary and Biologically Inspired Music and Art, Springer, Cham, pp. 128-143, 2017.
7 K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv preprint, arXiv:1409.1556, 2014.
8 D. Eck and J. Schmidhuber, "A First Look at Music Composition using LSTM Recurrent Neural Networks," Istituto Dalle Molle di Studi sull'Intelligenza Artificiale, Vol. 103, p. 48, 2002.
9 H.J. Choi, J.-H. Hwang, S. Ryu, and S. Kim, "Music Generation Algorithm based on the Color-Emotional Effect of a Painting," Journal of Korea Multimedia Society, Vol. 23, No. 6, pp. 765-771, 2020.
10 D.P. Kingma and M. Welling, "Auto-Encoding Variational Bayes," arXiv preprint, arXiv:1312.6114, 2013.
11 I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, et al., "Generative Adversarial Nets," Proceedings of the 27th International Conference on Neural Information Processing Systems, Vol. 2, pp. 2672-2680, 2014.