Acknowledgement
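This research was carried out under the 2022 Copyright Technology R&D Program of the Ministry of Culture, Sports and Tourism and the Korea Creative Content Agency (Project name: Development of high-speed music search technology using deep learning, Project number: CR202104004).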
References
- Y. V. S. Murthy and S. G. Koolagudi, "Content-based music information retrieval and its applications toward the music industry: A review," ACM Comput. Surv. 51, 1-46 (2019).
- J. S. Seo, J. Kim, and J. Park, "Centroid-model based music similarity with alpha divergence" (in Korean), J. Acoust. Soc. Kr. 35, 83-91 (2016). https://doi.org/10.7776/ASK.2016.35.2.083
- F. Yesiler, G. Doras, R. M. Bittner, C. J. Tralie, and J. Serra, "Audio-based musical version identification: Elements and challenges," IEEE Signal Process. Mag. 38, 115-136 (2021).
- J. Serra, E. Gomez, P. Herrera, and X. Serra, "Chroma binary similarity and local alignment applied to cover song identification," IEEE Trans. Audio Speech Lang. Process. 16, 1138-1151 (2008). https://doi.org/10.1109/TASL.2008.924595
- J. S. Seo, "Cover song search based on magnitude and phase of the 2D Fourier transform" (in Korean), J. Acoust. Soc. Kr. 37, 518-524 (2018).
- G. Doras and G. Peeters, "Cover detection using dominant melody embeddings," Proc. ISMIR, 107-114 (2019).
- F. Yesiler, J. Serra, and E. Gomez, "Accurate and scalable version identification using musically-motivated embeddings," Proc. ICASSP, 21-25 (2020).
- X. Du, Z. Yu, B. Zhu, X. Chen, and Z. Ma, "Bytecover: Cover song identification via multi-loss training," Proc. ICASSP, 551-555 (2021).
- S. Prince, P. Li, Y. Fu, U. Mohammed, and J. Elder, "Probabilistic models for inference about identity," IEEE Trans. Pattern Anal. Mach. Intell. 34, 144-157 (2012). https://doi.org/10.1109/TPAMI.2011.104
- P. Rajan, A. Afanasyev, V. Hautamaki, and T. Kinnunen, "From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification," Digit. Signal Process. 31, 93-101 (2014). https://doi.org/10.1016/j.dsp.2014.05.001
- D. Snyder, D. Garcia-Romero, G. Sell, A. McCree, D. Povey, and S. Khudanpur, "Speaker recognition for multi-speaker conversations using x-vectors," Proc. ICASSP, 5796-5800 (2019).
- B. McFee and J. P. Bello, "Structured training for large-vocabulary chord recognition," Proc. ISMIR, 188-194 (2017).
- A. Hermans, L. Beyer, and B. Leibe, "In defense of the triplet loss for person re-identification," arXiv:1703.07737 (2017).
- H. Luo, Y. Gu, X. Liao, S. Lai, and W. Jiang, "Bag of tricks and a strong baseline for deep person re-identification," Proc. CVPR workshops, 1487-1495 (2019).
- F. Yesiler, C. Tralie, A. Correya, D. F. Silva, P. Tovstogan, E. Gomez, and X. Serra, "Da-TACOS: A dataset for cover song identification and understanding," Proc. ISMIR, 327-334 (2019).
- Covers80 Cover Song Data Set, http://labrosa.ee.columbia.edu/projects/coversongs/covers80/ (Last viewed February 1, 2017).
- F. Yesiler, J. Serra, and E. Gomez, "Less is more: Faster and better music version identification with embedding distillation," Proc. ISMIR, 884-892 (2020).