• Title/Summary/Keyword: Music Similarity

A Content-based Music Similarity Retrieval System (내용 기반 음악 유사 구간 검색 시스템)

  • Kim, Hyunwoo;Han, Byeong-jun;Kim, Cheol-Hwan;Lee, Kyogu
    • Annual Conference of KIPS / 2010.11a / pp.732-735 / 2010
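  • This study proposes a system that searches a music database for the segments most similar to a given segment of a song. The proposed system treats music as multidimensional time-series data and uses a similarity measure that accounts for differences in key and tempo. In the preprocessing stage of the similarity computation, the system compensates for key differences, detects beats, and averages the extracted chromagram in synchrony with the detected beats. It then computes the similarity between two segments using dynamic time warping (DTW) and outputs the search results sorted by similarity. With the proposed system, users can find similar segments more easily by reviewing the segment pairs returned by the selected-segment similarity search and by the automatic similarity search.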
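
The DTW comparison described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each segment is already a list of beat-synchronous 12-bin chroma vectors, uses cosine distance as the local cost, and handles key differences by trying all 12 circular shifts (the abstract does not say how key compensation is actually done). All function names are hypothetical.

```python
import math

def rotate(frame, shift):
    """Circularly shift a 12-bin chroma vector by `shift` semitones."""
    frame = list(frame)
    return frame[-shift:] + frame[:-shift] if shift else frame

def cosine_distance(a, b):
    """1 - cosine similarity; silent (all-zero) frames count as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def dtw_distance(seq_a, seq_b):
    """Classic DTW over two chroma sequences, normalized by total length."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = cosine_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)

def key_invariant_dtw(seq_a, seq_b):
    """Assumed key handling: smallest DTW distance over all 12 transpositions."""
    return min(
        dtw_distance(seq_a, [rotate(f, s) for f in seq_b]) for s in range(12)
    )
```

Lower values mean more similar segments, so candidate segments would be sorted in ascending order of `key_invariant_dtw`.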

A system for recommending audio devices based on frequency band analysis of vocal component in sound source (음원 내 보컬 주파수 대역 분석에 기반한 음향기기 추천시스템)

  • Jeong-Hyun, Kim;Cheol-Min, Seok;Min-Ju, Kim;Su-Yeon, Kim
    • Journal of Korea Society of Industrial Information Systems / v.27 no.6 / pp.1-12 / 2022
  • As music streaming services and the Hi-Fi market grow, various audio devices are being released. As a result, consumers have a wider range of product choices, but it has become more difficult to find products that match their musical tastes. In this study, we propose a system that extracts the vocal component from the user's preferred sound source and recommends the most suitable audio device based on this information. To achieve this, the original sound source was first separated using the Python library Spleeter to extract the vocal track, and the frequency band data collected for manufacturers' audio devices were plotted as a grid graph. The Matching Gap Index (MGI) was proposed as an indicator for comparing the frequency band of the extracted vocal track with the measured frequency band data of the audio devices. Based on the calculated MGI value, the audio device most similar to the user's preference is recommended. The recommendation results were verified using per-genre equalizer data provided by professional audio companies.

Feature-Based Image Retrieval using SOM-Based R*-Tree

  • Shin, Min-Hwa;Kwon, Chang-Hee;Bae, Sang-Hyun
    • Proceedings of the KAIS Fall Conference / 2003.11a / pp.223-230 / 2003
  • Feature-based similarity retrieval has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects (e.g., documents, images, video, music scores). For example, images are represented by their color histograms, texture vectors, and shape descriptors, and are usually high-dimensional data. The performance of conventional multidimensional data structures (e.g., the R-tree family, K-D-B tree, grid file, TV-tree) tends to deteriorate as the number of dimensions of the feature vectors increases. The R*-tree is the most successful variant of the R-tree. In this paper, we propose a SOM-based R*-tree as a new indexing method for high-dimensional feature vectors. The SOM-based R*-tree combines a SOM and an R*-tree to achieve search performance that scales better to high dimensionalities. Self-organizing maps (SOMs) provide a mapping from high-dimensional feature vectors onto a two-dimensional space. The mapping preserves the topology of the feature vectors. The map is called a topological feature map, and it preserves the mutual relationships (similarity) in the feature space of the input data, clustering mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a codebook vector. A best-matching-image list (BMIL) holds the images that are closest to each codebook vector. In a topological feature map, there are empty nodes in which no image is classified. When we build an R*-tree, we use the codebook vectors of the topological feature map with the empty nodes eliminated, since empty nodes cause unnecessary disk access and degrade retrieval performance. We experimentally compare the retrieval time cost of a SOM-based R*-tree with that of a SOM and an R*-tree using color feature vectors extracted from 40,000 images. The results show that the SOM-based R*-tree outperforms both the SOM and the R*-tree, owing to reductions in the number of nodes required to build the R*-tree and in retrieval time cost.
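
The BMIL construction and empty-node elimination described in this abstract might look roughly like the sketch below. It assumes the SOM has already been trained, so the codebook vectors are given, and it stops before the R*-tree itself: the surviving codebook vectors are simply returned as the entries that would be indexed. The function names and the squared Euclidean distance are illustrative assumptions, not details taken from the paper.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_bmil(codebook, features):
    """Assign every image to its best-matching node.

    codebook: list of codebook vectors, one per SOM node (assumed pre-trained).
    features: dict mapping image id -> feature vector.
    Returns a dict mapping node index -> list of image ids (the BMIL).
    """
    bmil = {node: [] for node in range(len(codebook))}
    for image_id, vector in features.items():
        best = min(
            range(len(codebook)),
            key=lambda node: squared_distance(codebook[node], vector),
        )
        bmil[best].append(image_id)
    return bmil

def non_empty_nodes(codebook, bmil):
    """Keep only codebook vectors whose BMIL is non-empty.

    These are the entries that would be inserted into the R*-tree.
    """
    return {node: codebook[node] for node, images in bmil.items() if images}
```

Dropping the empty nodes before indexing is what keeps the tree from holding entries that can never return an image.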

A Multimodal Profile Ensemble Approach to Development of Recommender Systems Using Big Data (빅데이터 기반 추천시스템 구현을 위한 다중 프로파일 앙상블 기법)

  • Kim, Minjeong;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.93-110 / 2015
  • A recommender system recommends products to customers who are likely to be interested in them. Various recommender systems have been developed based on automated information filtering technology. Collaborative filtering (CF), one of the most successful recommendation algorithms, has been applied in a number of domains, such as recommending Web pages, books, movies, music, and products. However, CF is known to have a critical shortcoming. CF finds neighbors whose preferences are similar to those of the target customer and recommends the products those neighbors have liked most. Thus, CF works properly only when there is a sufficient number of ratings on common products from customers. When customer ratings are scarce, CF forms inaccurate neighborhoods, which results in poor recommendations. To improve the performance of CF-based recommender systems, most related studies have focused on developing novel algorithms under the assumption of a single profile, created from users' item ratings, purchase transactions, or Web access logs. With the advent of big data, companies have come to collect more data and to use a wide variety of large-scale information. Many companies therefore recognize the importance of utilizing big data, because it allows them to improve their competitiveness and create new value. In particular, the use of personal big data in recommender systems is a growing issue, because personal big data facilitate more accurate identification of users' preferences and behaviors. The proposed recommendation methodology is as follows. First, multimodal user profiles are created from personal big data in order to grasp the preferences and behavior of users from various viewpoints. We derive five user profiles based on personal information: ratings, site preferences, demographics, Internet usage, and topics in text. Next, the similarity between users is calculated based on the profiles, and the neighbors of each user are identified from the results. One of three ensemble approaches is applied to calculate the similarity: the similarity of the combined profile, the average similarity of the individual profiles, or the weighted average similarity of the individual profiles. Finally, the products most preferred by the neighbors are recommended to the target user. For the experiments, we used demographic data and a very large volume of Web log transactions for 5,000 panel users of a company that specializes in analyzing Web site rankings. R and SAS E-miner were used to implement the proposed recommender system and to conduct the topic analysis using keyword search, respectively. To evaluate recommendation performance, we used 60% of the data for training and 40% for testing. Five-fold cross-validation was also conducted to enhance the reliability of our experiments. The F1 metric, a widely used combined metric that gives equal weight to recall and precision, was employed for evaluation. In the evaluation, the proposed methodology achieved a significant improvement over the single-profile CF algorithm. In particular, the ensemble approach using weighted average similarity showed the highest performance: the improvement in F1 was 16.9 percent for the ensemble approach using weighted average similarity and 8.1 percent for the approach using the average similarity of each profile. From these results, we conclude that the multimodal profile ensemble approach is a viable solution to the problems encountered when customer ratings are scarce. This study is significant in suggesting what kinds of information can be used to create profiles in a big data environment and how they can be combined and utilized effectively. However, our methodology requires further study before real-world application. We need to compare recommendation accuracy when the proposed method is applied to different recommendation algorithms, and then identify which combination performs best.
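
The three ensemble similarities and the F1 metric named in this abstract could be sketched as below. This is an illustration under stated assumptions rather than the authors' method: each user is represented as a list of numeric profile vectors, cosine similarity is used for every profile, the "combined profile" is taken to be a simple concatenation, and the weights are supplied by the caller. The abstract specifies none of these details.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def combined_similarity(user_u, user_v):
    """Similarity of the combined profile (assumed: concatenated vectors)."""
    flat_u = [x for profile in user_u for x in profile]
    flat_v = [x for profile in user_v for x in profile]
    return cosine(flat_u, flat_v)

def average_similarity(user_u, user_v):
    """Unweighted mean of the per-profile similarities."""
    sims = [cosine(p, q) for p, q in zip(user_u, user_v)]
    return sum(sims) / len(sims)

def weighted_similarity(user_u, user_v, weights):
    """Weighted mean of the per-profile similarities (weights are assumed)."""
    sims = [cosine(p, q) for p, q in zip(user_u, user_v)]
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Neighbors would then be the users with the highest ensemble similarity to the target user.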

Interval-Based Singing Program for Improving the Accuracy of Pitch Production in Children With Cochlear Implants: A Case Study (음정 모방 중심 노래부르기를 통한 인공와우이식아동의 음고 산출 정확도 향상 사례)

  • Kim, Hyo Jin;Chong, Hyun Ju
    • Journal of Music and Human Behavior / v.14 no.1 / pp.1-16 / 2017
  • The purpose of this study was to examine changes in the accuracy and range of pitch production in children with cochlear implants (CIs) after an interval-based singing program. Three children with CIs, aged 5, received twelve 35-minute individual sessions two to three times per week. The interval-based singing program was built around third, fifth, and eighth intervals and implemented pitch discrimination, pitch imitation, and singing songs with the target intervals in sequence. Changes in the accuracy of pitch production during pitch imitation and singing were measured at pretest and posttest. The results demonstrated that all participants improved in pitch accuracy and produced the target notes with great similarity to the expected pitches in the original song. The range of produced pitch also increased after the program. The results indicate that sequential pitch-imitation trials in a multisensory environment, designed to facilitate the processing of pitch information, may reflect how this population perceives pitch information and may help children with CIs improve their pitch accuracy.

Algorithm to Search for the Original Song from a Cover Song Using Inflection Points of the Melody Line (멜로디 라인의 변곡점을 활용한 커버곡의 원곡 검색 알고리즘)

  • Lee, Bo Hyun;Kim, Myung
    • KIPS Transactions on Software and Data Engineering / v.10 no.5 / pp.195-200 / 2021
  • With the growth of video sharing platforms, the volume of video uploads is exploding. Such videos often include various types of music, including cover songs. To protect music copyright, an algorithm that finds the original song for a cover song is essential. However, finding the original is not easy, because a cover song modifies the composition, speed, and overall structure of the original song. So far, no effective algorithm is known for searching for the original song of a cover song. In this paper, we propose an algorithm that searches for the original song of a cover song using the inflection points of the melody line. Inflection points represent the characteristic points of change in the melody sequence. The proposed algorithm compares the original song and the cover song using the sequence of inflection points for the representative phrase of the original song. Because the characteristics of the representative phrase are used, the algorithm's search performance is excellent even if the cover song modifies the overall composition of the song. Also, because the proposed algorithm uses only the features of the inflection point sequence, its memory usage is very low. The efficiency of the algorithm was verified through performance evaluation.
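
One way to picture the inflection-point idea in this abstract is sketched below. The abstract does not define an inflection point precisely, so this sketch assumes it is a note where the melodic direction reverses (a local peak or valley), with the melody given as a list of MIDI pitch numbers. Taking differences between consecutive inflection points is a further assumption, added here so that a transposed cover yields the same sequence. The function names are hypothetical.

```python
def inflection_points(pitches):
    """Return the pitches at which the melody changes direction.

    Repeated notes (zero difference) are ignored when tracking direction.
    """
    points = []
    previous_direction = 0
    for i in range(1, len(pitches)):
        diff = pitches[i] - pitches[i - 1]
        direction = (diff > 0) - (diff < 0)  # +1 rising, -1 falling, 0 flat
        if direction == 0:
            continue
        if previous_direction != 0 and direction != previous_direction:
            points.append(pitches[i - 1])
        previous_direction = direction
    return points

def interval_profile(points):
    """Differences between consecutive inflection points (key-independent)."""
    return [b - a for a, b in zip(points, points[1:])]
```

Because only the turning points are kept, the stored sequence is much shorter than the full melody, which is consistent with the low memory usage the abstract reports.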

Enhanced Spectral Hole Substitution for Improving Speech Quality in Low Bit-Rate Audio Coding

  • Lee, Chang-Heon;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea / v.29 no.3E / pp.131-139 / 2010
  • This paper proposes a novel spectral hole substitution technique for low bit-rate audio coding. The spectral holes frequently occurring in relatively weak energy bands due to zero bit quantization result in severe quality degradation, especially for harmonic signals such as speech vowels. The enhanced aacPlus (EAAC) audio codec artificially adjusts the minimum signal-to-mask ratio (SMR) to reduce the number of spectral holes, but it still produces noisy sound. The proposed method selectively predicts the spectral shapes of hole bands using either intra-band correlation, i.e. harmonically related coefficients nearby or inter-band correlation, i.e. previous frames. For the bands that have low prediction gain, only the energy term is quantized and spectral shapes are replaced by pseudo random values in the decoding stage. To minimize perceptual distortion caused by spectral mismatching, the criterion of the just noticeable level difference (JNLD) and spectral similarity between original and predicted shapes are adopted for quantizing the energy term. Simulation results show that the proposed method implemented into the EAAC baseline coder significantly improves speech quality at low bit-rates while keeping equivalent quality for mixed and music contents.

Simulation Method for Playing Complex Polyrhythm (복잡한 폴리리듬을 연주하기 위한 가상(假想)리듬연주법)

  • Kim, Hyounjong
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.10 / pp.4426-4431 / 2012
  • Although polyrhythm is a complex form of rhythm, some polyrhythms (for example, '2 against 3' or '3 against 4') are considered simple compared with others such as '5 against 4' or '7 against 4', probably because they are used often enough to be readily understood and played. The polyrhythms addressed in this study, which I call complex polyrhythms, cannot be played by someone who is unfamiliar with quintuplets or septuplets. I propose a new way of playing polyrhythms, called the 'Simulation Method'. Its main purpose is to make complex polyrhythms approachable by exploiting their mathematical similarity, even though they are inherently hard to play because of their mathematical complexity.
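
The arithmetic behind an 'm against n' polyrhythm can be illustrated with a common-pulse grid. The sketch below is generic and is not the paper's 'Simulation Method', whose details the abstract does not give: it assumes one cycle is divided into lcm(m, n) equal pulses and reports on which pulses each voice plays. The function name is hypothetical.

```python
from math import gcd

def polyrhythm_grid(m, n):
    """Return (pulses per cycle, onsets of the m-voice, onsets of the n-voice)."""
    pulses = m * n // gcd(m, n)  # least common multiple of m and n
    voice_m = [i * (pulses // m) for i in range(m)]
    voice_n = [i * (pulses // n) for i in range(n)]
    return pulses, voice_m, voice_n
```

For example, `polyrhythm_grid(3, 2)` needs only 6 pulses per cycle, whereas `polyrhythm_grid(5, 4)` needs 20, which gives one numeric sense in which the latter is harder to count.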

An Automated Technique for Illegal Site Detection using the Sequence of HTML Tags (HTML 태그 순서를 이용한 불법 사이트 탐지 자동화 기술)

  • Lee, Kiryong;Lee, Heejo
    • Journal of KIISE / v.43 no.10 / pp.1173-1178 / 2016
  • Since the introduction of the BitTorrent protocol in 2001, almost anything can be downloaded through file sharing, including music, movies, and software. As a result, copyright holders suffer from illegal sharing of copyrighted content. To address this problem, countries have enacted laws against illegal sharing, and Internet service providers block pirate sites. However, illegal sites such as The Pirate Bay easily reopen by changing their domain names. We therefore propose a technique to easily detect reopened pirate sites. This automated technique collects domain names using the Google search engine and measures similarity with the Longest Common Subsequence (LCS) algorithm by comparing the tag structure of the source web page with that of the reopened web page. For evaluation, we collected 2,383 domains from Google search. Experimental results showed that applying the LCS algorithm detected a total of 44 pirate sites among the collected domains. In addition, the technique detected 23 pirate sites among 805 domains when applied to foreign pirate sites. The experiment shows that an automated detection system makes it easy to detect reopened pirate sites.
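
The tag-sequence comparison described in this abstract can be sketched with Python's standard `html.parser` module and a textbook LCS dynamic program. The normalization `2 * LCS / (len(a) + len(b))`, the choice to record only opening tags, and the function names are assumptions made for illustration. The abstract does not state how the LCS length is turned into a similarity score or what threshold is used.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Records the names of opening tags in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def tag_sequence(html):
    """Extract the ordered list of opening tag names from an HTML string."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags

def lcs_length(a, b):
    """Length of the longest common subsequence, using two rolling rows."""
    previous = [0] * (len(b) + 1)
    for x in a:
        current = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]

def tag_similarity(html_a, html_b):
    """Assumed score in [0, 1]: 2 * LCS length / total number of tags."""
    a, b = tag_sequence(html_a), tag_sequence(html_b)
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))
```

A reopened site that keeps its page template but changes its domain would score close to 1.0 against the original page, regardless of changes to text or attribute values.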

A Query by Humming System Using Humming Algebra (허밍 대수를 이용한 허밍 질의처리 시스템)

  • Shin, Je-Yong;Han, Wook-Shin;Lee, Jong-Hak
    • Journal of KIISE: Computing Practices and Letters / v.15 no.8 / pp.534-546 / 2009
  • Query by humming is an effective and intuitive querying mechanism when a user wants to find a song without knowing the lyrics. A query-by-humming system takes a user-hummed melody as input, compares it with the melodies in a music database, and returns the top-k melodies most similar to the input. In this paper, we propose a novel algebra for query by humming, and we design and implement a real query-by-humming system called HummingBase by exploiting the algebra. By analyzing existing similarity search techniques, we derive 10 core operators for the algebra. Using this well-defined algebra, such a system can easily be implemented in an extensible and modular way. With two case studies, we show that the proposed algebra can easily represent the query processing steps of existing query-by-humming systems.