• Title/Summary/Keyword: Music Dataset

A Selection of Optimal EEG Channel for Emotion Analysis According to Music Listening using Stochastic Variables (확률변수를 이용한 음악에 따른 감정분석에의 최적 EEG 채널 선택)

  • Byun, Sung-Woo; Lee, So-Min; Lee, Seok-Pil
    • The Transactions of The Korean Institute of Electrical Engineers / v.62 no.11 / pp.1598-1603 / 2013
  • Recently, research on the relationship between emotional states and musical stimuli has been increasing. In many previous works, data from all extracted channels are used for pattern classification, but these methods suffer from computational complexity and inaccuracy. This paper proposes a method for selecting the optimal EEG channels that efficiently reflect the emotional state during music listening by analyzing stochastic feature vectors. This makes EEG pattern classification relatively simple by reducing the amount of data to process.
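
A minimal sketch of the idea, ranking channels by how well per-channel statistics separate the emotion classes. The Fisher-style score and the mean/variance feature set below are assumptions; the abstract does not specify the paper's exact stochastic criterion:

```python
import numpy as np

def channel_fisher_scores(trials, labels):
    """Rank EEG channels by a Fisher-style class-separability score.

    trials: array (n_trials, n_channels, n_samples) of raw EEG
    labels: array (n_trials,) of emotion-class labels
    Returns one score per channel; higher = more discriminative.
    """
    # Per-trial, per-channel statistical features (mean and variance
    # here; the paper's stochastic feature set is likely richer).
    feats = np.stack([trials.mean(axis=2), trials.var(axis=2)], axis=2)
    classes = np.unique(labels)
    scores = np.zeros(trials.shape[1])
    for ch in range(trials.shape[1]):
        x = feats[:, ch, :]                      # (n_trials, n_feats)
        overall = x.mean(axis=0)
        between = sum(np.sum(labels == c) *
                      np.sum((x[labels == c].mean(axis=0) - overall) ** 2)
                      for c in classes)
        within = sum(np.sum((x[labels == c] - x[labels == c].mean(axis=0)) ** 2)
                     for c in classes)
        scores[ch] = between / (within + 1e-12)
    return scores

# Usage: keep only the top-k channels before pattern classification.
# top_k = np.argsort(channel_fisher_scores(X, y))[::-1][:4]
```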

A MapReduce-based Artificial Neural Network Churn Prediction for Music Streaming Service

  • Chen, Min
    • International Journal of Computer Science & Network Security / v.22 no.1 / pp.55-60 / 2022
  • Churn prediction is a critical long-term problem for many businesses such as music, games, and magazines. The churn probability can be used to study many aspects of a business, including proactive customer marketing, sales prediction, and churn-sensitive pricing models. It is quite challenging to design a machine learning model that predicts customer churn accurately, due to the large volume of time-series data and its temporal structure. In this paper, a parallel artificial neural network is proposed to create a highly accurate customer churn model on a large customer dataset. The proposed model achieves a significant improvement in churn prediction accuracy. The scalability and effectiveness of the proposed algorithm are also studied.
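
A toy sketch of the MapReduce training pattern the abstract describes: shard-level gradients are computed in a "map" step and averaged in a "reduce" step. The logistic model, learning rate, and epoch count below are placeholders, not the paper's network or Hadoop pipeline:

```python
import numpy as np
from functools import reduce

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def map_gradient(shard, w):
    """Map step: gradient of the logistic loss on one data shard."""
    X, y = shard                       # X: (n, d), y: (n,) churn labels in {0, 1}
    err = sigmoid(X @ w) - y
    return X.T @ err / len(y)

def train_mapreduce(shards, n_features, epochs=50, lr=0.1):
    """Reduce step: average the shard gradients, then update weights."""
    w = np.zeros(n_features)
    for _ in range(epochs):
        grads = [map_gradient(s, w) for s in shards]   # "map" (parallel in Hadoop)
        w -= lr * reduce(np.add, grads) / len(grads)   # "reduce" (average)
    return w
```

The same map/reduce split is what makes the approach scale: each worker only ever sees its own shard, and the driver exchanges one gradient-sized message per worker per epoch.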

A Musical Genre Classification Method Based on the Octave-Band Order Statistics (옥타브밴드 순서 통계량에 기반한 음악 장르 분류)

  • Seo, Jin Soo
    • The Journal of the Acoustical Society of Korea / v.33 no.1 / pp.81-86 / 2014
  • This paper presents a study on the effectiveness of using spectral and temporal octave-band order statistics for musical genre classification. In order to represent the relative disposition of the harmonic and non-harmonic components, we utilize the octave-band order statistics of the power spectral distribution. Experiments on two widely used music datasets were performed; the results show that the octave-band order statistics improve genre classification accuracy by 2.61% on one dataset and 8.9% on the other, compared with the mel-frequency cepstral coefficients and the octave-band spectral contrast. Experimental results show that the octave-band order statistics are promising for musical genre classification.
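
A minimal sketch of the spectral feature: sort the power spectrum within each octave band and keep a few order statistics. The top-down band layout and the quantile choices are assumptions, and the paper's temporal order statistics are omitted here:

```python
import numpy as np

def octave_band_order_stats(frame, sr, n_bands=8, quantiles=(0.25, 0.5, 0.75)):
    """Order statistics of the power spectrum in octave bands.

    frame: 1-D audio frame; sr: sample rate in Hz.
    Returns n_bands * len(quantiles) features.
    """
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    feats = []
    hi = sr / 2.0
    for _ in range(n_bands):          # octave bands, from Nyquist downward
        lo = hi / 2.0
        band = np.sort(spec[(freqs >= lo) & (freqs < hi)])
        for q in quantiles:           # order statistics = sorted quantiles
            feats.append(band[int(q * (len(band) - 1))] if len(band) else 0.0)
        hi = lo
    return np.asarray(feats)
```

Because the statistics are taken on the *sorted* band spectrum, they capture the relative disposition of strong (harmonic) and weak (non-harmonic) components without depending on where exactly the peaks fall.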

A Study on the Efficient Feature Vector Extraction for Music Information Retrieval System (음악 정보검색 시스템을 위한 효율적인 특징 벡터 추출에 관한 연구)

  • Yoon, Won-Jung; Lee, Kang-Kyu; Park, Kyu-Sik
    • The Journal of the Acoustical Society of Korea / v.23 no.7 / pp.532-539 / 2004
  • In this paper, we propose a content-based music information retrieval (MIR) system based on the query-by-example (QBE) method. The proposed system retrieves queried music from a database of 240 music files, 60 samples collected for each of four genres: Classical, Hiphop, Jazz, and Rock. From each query music signal, the system extracts a 60-dimensional feature vector, including the spectral centroid, rolloff, and flux based on the STFT as well as LPC, MFCC, and beat information, and retrieves the queried music from a trained database using the Euclidean distance measure. In order to choose optimal features from the 60-dimensional feature vector, the SFS method is applied to select 10 optimal features, which are used in the proposed system. The experimental results verify the superior performance of the proposed system, which achieves a hit rate of 84% and an MRR of 0.63, a nearly 10% improvement over previous methods. Additional experiments on system performance for random query portions and query lengths were conducted, and a serious instability problem in system performance is pointed out.
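
A minimal sketch of the sequential forward selection (SFS) loop with Euclidean nearest-neighbor retrieval; using hit rate on a held-out query set as the greedy selection criterion is an assumption:

```python
import numpy as np

def retrieval_hit_rate(D_train, D_query, feat_idx):
    """Fraction of queries whose Euclidean nearest neighbor
    (restricted to the features in feat_idx) has the correct genre."""
    Xt, yt = D_train
    Xq, yq = D_query
    diff = Xq[:, feat_idx][:, None, :] - Xt[:, feat_idx][None, :, :]
    d = np.linalg.norm(diff, axis=2)          # (n_queries, n_train)
    return np.mean(yt[d.argmin(axis=1)] == yq)

def sequential_forward_selection(D_train, D_query, n_feats=60, target_dim=10):
    """Greedy SFS: repeatedly add the single feature that most
    improves retrieval hit rate, until target_dim features remain."""
    selected = []
    for _ in range(target_dim):
        remaining = [f for f in range(n_feats) if f not in selected]
        best = max(remaining,
                   key=lambda f: retrieval_hit_rate(D_train, D_query, selected + [f]))
        selected.append(best)
    return selected
```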

Brainwave-based Mood Classification Using Regularized Common Spatial Pattern Filter

  • Shin, Saim; Jang, Sei-Jin; Lee, Donghyun; Park, Unsang; Kim, Ji-Hwan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.2 / pp.807-824 / 2016
  • In this paper, a method of mood classification based on user brainwaves is proposed for real-time application in commercial services. Unlike conventional mood-analysis systems, the proposed method focuses on classifying real-time user moods by analyzing the user's brainwaves. Applying brainwave-related research in commercial services requires two elements: robust performance and a comfortable fit. This paper proposes a filter based on Regularized Common Spatial Patterns (RCSP) and presents its use in the implementation of mood classification for a music service via a wireless consumer electroencephalography (EEG) device that has only 14 pins. Despite the use of fewer pins, the proposed system demonstrates approximately 10 percentage points higher accuracy in mood classification on the same dataset, compared to one of the best EEG-based mood-classification systems, which uses a skullcap with 32 pins (EU FP7 PetaMedia project). This paper confirms the commercial viability of brainwave-based mood-classification technology. To analyze the improvements of the system, the changes in feature variation after applying the RCSP filters and the performance variations between users are also investigated. Furthermore, as a prototype service, this paper introduces a mood-based music list management system called MyMusicShuffler based on the proposed mood-classification method.
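
A sketch of one standard RCSP formulation (shrinkage of the class covariances toward the identity, followed by a generalized eigendecomposition); the paper's exact regularization scheme may differ:

```python
import numpy as np
from scipy.linalg import eigh

def rcsp_filters(trials_a, trials_b, reg=0.1, n_filters=4):
    """Regularized Common Spatial Pattern filters for two mood classes.

    trials_*: arrays (n_trials, n_channels, n_samples) of EEG epochs.
    Returns spatial filters W of shape (n_channels, 2 * n_filters).
    """
    def avg_cov(trials):
        covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
        return np.mean(covs, axis=0)

    Ca, Cb = avg_cov(trials_a), avg_cov(trials_b)
    n = Ca.shape[0]
    # Shrink each class covariance toward the identity (Tikhonov-style
    # regularization; helps with few channels and noisy consumer EEG).
    Ca = (1 - reg) * Ca + reg * np.trace(Ca) / n * np.eye(n)
    Cb = (1 - reg) * Cb + reg * np.trace(Cb) / n * np.eye(n)
    # Generalized eigendecomposition of (Ca, Ca + Cb); the extreme
    # eigenvectors maximize variance for one class, minimize it for the other.
    vals, vecs = eigh(Ca, Ca + Cb)
    order = np.argsort(vals)
    picks = np.concatenate([order[:n_filters], order[-n_filters:]])
    return vecs[:, picks]

# Classifier features are then log-variances of the filtered trials:
# feat = np.log(np.var(W.T @ trial, axis=1))
```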

Music Genre Classification using Spikegram and Deep Neural Network (스파이크그램과 심층 신경망을 이용한 음악 장르 분류)

  • Jang, Woo-Jin; Yun, Ho-Won; Shin, Seong-Hyeon; Cho, Hyo-Jin; Jang, Won; Park, Hochong
    • Journal of Broadcast Engineering / v.22 no.6 / pp.693-701 / 2017
  • In this paper, we propose a new method for music genre classification using the spikegram and a deep neural network. The human auditory system encodes an input sound in the time and frequency domains so as to maximize the amount of sound information delivered to the brain using minimal energy and resources. The spikegram is a method of analyzing a waveform based on this encoding function of the auditory system. In the proposed method, we analyze the signal using the spikegram and extract a feature vector composed of the key information for genre classification, which is used as the input to the neural network. We measure the performance of music genre classification on the GTZAN dataset, consisting of 10 music genres, and confirm that the proposed method provides good performance with a low-dimensional feature vector, compared to current state-of-the-art methods.
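
Spikegrams are commonly computed by greedy matching pursuit over an auditory (e.g., gammatone) kernel dictionary; the simplified sketch below shows only that encoding step, not the paper's feature extraction or DNN:

```python
import numpy as np

def spikegram(signal, kernels, n_spikes=500):
    """Greedy matching pursuit: encode a signal as a sparse list of
    (kernel_index, time, amplitude) 'spikes'.

    kernels: list of unit-norm 1-D arrays (e.g., gammatone filters),
    so the correlation value equals the optimal amplitude.
    """
    residual = signal.astype(float).copy()
    spikes = []
    for _ in range(n_spikes):
        best = None
        for k, ker in enumerate(kernels):
            corr = np.correlate(residual, ker, mode='valid')
            t = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[t]) > abs(best[2]):
                best = (k, t, corr[t])
        k, t, amp = best
        residual[t:t + len(kernels[k])] -= amp * kernels[k]   # remove the matched atom
        spikes.append(best)
    return spikes   # genre features are then derived from these spikes
```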

Analysis of YouTube Viewers' Characteristics and Responses to Virtual Idols (버추얼 아이돌에 대한 유튜브 시청자 특성과 반응 분석)

  • JeongYoon Kang; Choonsung Shin; Hieyong Jeong
    • Journal of Information Technology Services / v.23 no.3 / pp.103-118 / 2024
  • Due to the advancement of virtual reality technology, virtual idols are widely used in the industrial and cultural content industries. However, social perceptions of virtual idols are difficult to exploit because they are not yet properly understood. Therefore, this paper collected and analyzed YouTube comments to identify differences in social perception between virtual idols and general idols through comparative analysis. The dataset was constructed by crawling comments from virtual-idol music videos with more than 10 million views and more than 10,000 comments. Keyword frequencies and TF-IDF values were derived from the collected dataset, and degree centrality and CONCOR clusters were analyzed on a semantic network using the UCINET program. The analysis found that comments on virtual idols frequently used keywords such as "person," "quality," "character," "reality," and "animation," whereas different reactions and perceptions were observed for general idols. Based on these results, general idols are mainly evaluated on their appearance and cultural factors, while social perceptions of virtual idols mix evaluations of cultural factors such as "song," "voice," and "choreography" with a focus on technical factors such as "person," "quality," "character," and "animation." However, keywords such as "song," "voice," "choreography," and "music" are included in the top 30 for virtual idols, as for regular idols, and appear in the same cluster, suggesting that virtual idols are gradually shifting from a minority taste toward mainstream culture. This study aims to provide academic and practical implications for the future expansion of the virtual-idol industry and cultural content industry by grasping the social perception of virtual idols.
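
A rough sketch of the keyword side of this pipeline: TF-IDF ranking plus a term co-occurrence network. networkx stands in for UCINET here, and real Korean comments would first need morphological tokenization, which is omitted:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
import networkx as nx

def keyword_network(comments, top_k=30):
    """Rank the top_k TF-IDF keywords and build a co-occurrence
    network over them (terms sharing a comment get an edge)."""
    vec = TfidfVectorizer(max_features=top_k)   # assumes whitespace-tokenizable text
    tfidf = vec.fit_transform(comments)
    terms = vec.get_feature_names_out()
    co = (tfidf > 0).astype(int)                # binary term presence per comment
    adj = (co.T @ co).toarray()                 # pairwise co-occurrence counts
    g = nx.Graph()
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            if adj[i, j] > 0:
                g.add_edge(terms[i], terms[j], weight=int(adj[i, j]))
    return terms, nx.degree_centrality(g)       # analogue of degree centrality
```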

Humming: Image Based Automatic Music Composition Using DeepJ Architecture (허밍: DeepJ 구조를 이용한 이미지 기반 자동 작곡 기법 연구)

  • Kim, Taehun; Jung, Keechul; Lee, Insung
    • Journal of Korea Multimedia Society / v.25 no.5 / pp.748-756 / 2022
  • Thanks to the match between AlphaGo and Lee Sedol, machine learning has received worldwide attention and huge investments. The performance improvement of computing devices has greatly contributed to big-data processing and the development of neural networks. Artificial intelligence not only imitates human beings in many fields, but sometimes even seems to exceed human capabilities. Although human creations are still considered superior, several artificial intelligences continue to challenge human creativity, and the quality of some creative outcomes produced by AI is as good as those produced by human beings. Sometimes they are not distinguishable, because a neural network can learn the common features contained in big data and reproduce them. In order to examine whether artificial intelligence can express the inherent characteristics of different arts, this paper proposes a new neural network model called Humming. It is an experimental model that combines VGG16, which extracts image features, and DeepJ's architecture, which excels at creating music in various genres. A dataset produced by our experiment shows meaningful and valid results. Different results, however, are produced when the amount of data is increased: the neural network produced similar patterns of music even for different classes of images, which was not what we were aiming for. Nevertheless, these new attempts may have significance as a starting point for the feature transfer that will be studied further.
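
The image-feature half of the proposed combination can be sketched as follows: a VGG16 embedding that would condition a DeepJ-style music generator. The generator itself is not reproduced, and the pooling choice and use of the vector are assumptions:

```python
import tensorflow as tf

def image_conditioning_vector(image_path):
    """Extract a VGG16 feature vector from an image; in a Humming-like
    model this vector would condition the music generator."""
    base = tf.keras.applications.VGG16(weights='imagenet',
                                       include_top=False, pooling='avg')
    img = tf.keras.utils.load_img(image_path, target_size=(224, 224))
    x = tf.keras.utils.img_to_array(img)[None, ...]           # (1, 224, 224, 3)
    x = tf.keras.applications.vgg16.preprocess_input(x)
    return base.predict(x)[0]   # 512-dim embedding fed to the generator
```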

Towards Low Complexity Model for Audio Event Detection

  • Saleem, Muhammad; Shah, Syed Muhammad Shehram; Saba, Erum; Pirzada, Nasrullah; Ahmed, Masood
    • International Journal of Computer Science & Network Security / v.22 no.9 / pp.175-182 / 2022
  • In our daily life, we come across different types of information, for example in multimedia and text formats. We all need different types of information in our daily routines, such as watching or reading the news, listening to the radio, and watching different types of videos. However, we sometimes run into problems when a certain type of information is required. For example, someone listening to the radio wants to hear jazz, but all the radio channels play pop music mixed with advertisements; the listener is stuck with pop music and gives up searching for jazz. This problem can be solved with an automatic audio classification system. Deep Learning (DL) models can make such audio classification easy, but they are expensive and difficult to deploy on edge devices like the Arduino Nano BLE Sense or Raspberry Pi, because these models usually require huge computational power, such as a graphics processing unit (GPU). To solve this problem, we propose a low-complexity DL model for Audio Event Detection (AED). We extract Mel-spectrograms of dimension 128×431×1 from the audio signals and apply normalization. A total of three data augmentation methods are applied: frequency masking, time masking, and mixup. In addition, we design a Convolutional Neural Network (CNN) with spatial dropout, batch normalization, and separable 2D convolutions inspired by VGGNet [1], and we reduce the model size by applying float16 quantization to the trained model. Experiments were conducted on the updated dataset provided by the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 challenge. We confirm that our model achieves a validation loss of 0.33 and an accuracy of 90.34% within a model size of 132.50 KB.
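
A simplified sketch of the three augmentations the abstract names; the mask widths and the mixup Beta(0.2, 0.2) coefficient are assumed values, not the paper's settings:

```python
import numpy as np

def augment(mel_a, mel_b, label_a, label_b, rng=np.random.default_rng()):
    """Frequency masking, time masking, then mixup of two examples.

    mel_*: (128, 431) log-mel spectrograms; label_*: one-hot vectors.
    """
    x = mel_a.copy()
    f0 = rng.integers(0, x.shape[0] - 16)
    x[f0:f0 + 16, :] = 0.0                  # frequency mask (16 bins, assumed width)
    t0 = rng.integers(0, x.shape[1] - 40)
    x[:, t0:t0 + 40] = 0.0                  # time mask (40 frames, assumed width)
    lam = rng.beta(0.2, 0.2)                # mixup coefficient
    x = lam * x + (1 - lam) * mel_b
    y = lam * label_a + (1 - lam) * label_b
    return x, y
```

The float16 shrink mentioned at the end corresponds to TensorFlow Lite post-training quantization (setting `converter.target_spec.supported_types = [tf.float16]` before conversion), which roughly halves the stored weight size.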

Deep Learning based Raw Audio Signal Bandwidth Extension System (딥러닝 기반 음향 신호 대역 확장 시스템)

  • Kim, Yun-Su; Seok, Jong-Won
    • Journal of IKEEE / v.24 no.4 / pp.1122-1128 / 2020
  • Bandwidth extension refers to restoring and expanding a narrowband (NB) signal that has been degraded in the encoding and decoding process, due to the lack of channel capacity or the characteristics of the codec installed in a mobile communication device, by converting it into a wideband (WB) signal. Bandwidth extension research has mainly focused on speech signals and on frequency-domain methods such as SBR (Spectral Band Replication) and IGF (Intelligent Gap Filling), which restore missing or damaged high bands based on complex feature extraction processes. In this paper, we propose a model that outputs a bandwidth-extended signal based on an autoencoder, using the residual connections of one-dimensional convolutional neural networks (CNN); the bandwidth is extended by inputting a time-domain signal of a fixed length without complicated pre-processing. In addition, it was confirmed that the damaged high band can be restored even by training on a dataset containing various types of sound sources, including music, not limited to speech.
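
A minimal sketch of the architecture class described: a 1-D convolutional autoencoder with residual connections mapping a narrowband waveform directly to a wideband one. Layer counts, kernel sizes, and the final skip connection are assumptions, not the paper's exact network:

```python
import tensorflow as tf

def bwe_model(n_samples=8192, n_blocks=4, width=64):
    """1-D CNN autoencoder with residual blocks for waveform
    bandwidth extension; operates on raw time-domain frames."""
    inp = tf.keras.Input(shape=(n_samples, 1))          # narrowband waveform in
    x = tf.keras.layers.Conv1D(width, 9, padding='same', activation='relu')(inp)
    for _ in range(n_blocks):
        h = tf.keras.layers.Conv1D(width, 9, padding='same', activation='relu')(x)
        h = tf.keras.layers.Conv1D(width, 9, padding='same')(h)
        x = tf.keras.layers.Add()([x, h])               # residual connection
    out = tf.keras.layers.Conv1D(1, 9, padding='same')(x)
    out = tf.keras.layers.Add()([inp, out])             # predict only the missing band
    return tf.keras.Model(inp, out)

# model = bwe_model(); model.compile(optimizer='adam', loss='mae')
# model.fit(nb_waveforms, wb_waveforms, ...)  # paired NB/WB training frames
```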