• Title/Summary/Keyword: Lyric Clustering

Clustering Meta Information of K-Pop Girl Groups Using Term Frequency-inverse Document Frequency Vectorization (단어-역문서 빈도 벡터화를 통한 한국 걸그룹의 음반 메타 정보 군집화)

  • JoonSeo Hyeon;JaeHyuk Cho
    • Journal of Platform Technology / v.11 no.3 / pp.12-23 / 2023
  • In the 2020s, the K-Pop market has been dominated by girl groups over boy groups and by the fourth generation over the third. This paper presents methods and results of lyric clustering to investigate whether a generational shift among girl groups has begun. We collected meta-information for 1,469 songs by 47 groups released from 2013 to 2022, divided it into lyric information and non-lyric meta-information, and quantified each separately. The lyric information was preprocessed by applying term frequency-inverse document frequency (TF-IDF) vectorization, following previous studies, and then keeping only the top vector values. The non-lyric meta-information was preprocessed with One-Hot Encoding to reduce the bias of relying on lyric information alone and to improve the clustering results. On the preprocessed data, Spherical K-Means scored 129% higher on the Silhouette Score and 45% higher on the Calinski-Harabasz Score than Hierarchical Clustering. This paper is expected to contribute to research on the development of Korean popular music and on the analysis and clustering of girl group lyrics.
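
The pipeline described in this abstract (TF-IDF on lyrics, keeping only the top values, one-hot encoding of non-lyric metadata, then Spherical K-Means) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the per-song top-k rule, the plain concatenation of the two feature blocks, the `genre` field, the toy lyrics, and every function name are assumptions made for the example.

```python
import math
from collections import Counter

def tfidf_top_k(docs, k):
    """TF-IDF per tokenized document, keeping only each document's k largest weights.
    Assumption: 'top vector values' is read here as a per-document top-k cut."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    vocab = sorted(df)
    rows = []
    for doc in docs:
        tf = Counter(doc)
        weights = {t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf}
        keep = set(sorted(weights, key=weights.get, reverse=True)[:k])
        rows.append([weights[t] if t in keep else 0.0 for t in vocab])
    return vocab, rows

def one_hot(values):
    """One-hot encode a list of categorical values (categories in sorted order)."""
    cats = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in cats] for v in values]

def unit(v):
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def spherical_kmeans(rows, init_idx, iters=20):
    """K-means on unit vectors: assign by cosine similarity, re-normalize centroids."""
    X = [unit(r) for r in rows]
    cents = [X[i] for i in init_idx]
    labels = []
    for _ in range(iters):
        labels = [
            max(range(len(cents)),
                key=lambda c: sum(a * b for a, b in zip(x, cents[c])))
            for x in X
        ]
        for c in range(len(cents)):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                cents[c] = unit([sum(col) for col in zip(*members)])
    return labels

# Toy, made-up data: tokenized lyrics plus one hypothetical metadata field.
lyrics = [
    ["love", "heart", "love", "night"],
    ["love", "heart", "dream"],
    ["fire", "power", "fire", "run"],
    ["power", "run", "fire"],
]
genres = ["ballad", "ballad", "dance", "dance"]

vocab, tfidf_rows = tfidf_top_k(lyrics, k=2)
# Assumption: lyric and non-lyric features are simply concatenated.
features = [t + h for t, h in zip(tfidf_rows, one_hot(genres))]
labels = spherical_kmeans(features, init_idx=[0, 2])
print(labels)
```

On this toy input the first two songs fall into one cluster and the last two into the other. The evaluation step (Silhouette Score and Calinski-Harabasz Score) is left out of the sketch.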


Highlight based Lyrics Search Considering the Characteristics of Query (사용자 질의어 특징을 반영한 하이라이트 기반 노래 가사 검색)

  • Kim, Kweon Yang
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.4 / pp.301-307 / 2016
  • This paper proposes a lyric search method that takes the characteristics of user queries into account. Because queries for lyric search tend to come from the highlight parts of a song, the method uses hierarchical agglomerative clustering to find the highlight and applies a Gaussian weighting so that the neighborhood of the highlight is considered as well as the highlight itself. With the mean of the Gaussian set at the highlight, the weighting function assigns higher weights near the highlight and lower weights farther from it. The paper then constructs an index of lyrics using this Gaussian weighting. In experiments on a data set obtained from 5 real users, the proposed method proved effective.
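
The Gaussian weighting described in this abstract can be illustrated with a small sketch. It assumes the highlight line index is already known (the paper finds it with hierarchical agglomerative clustering, which is not reproduced here), and the scoring rule, the `sigma` parameter, the function names, and the sample lyrics are all hypothetical choices made for the example.

```python
import math

def gaussian_weights(n_lines, highlight, sigma):
    """Weight for each line index: a Gaussian centred on the highlight line."""
    return [
        math.exp(-((i - highlight) ** 2) / (2 * sigma ** 2))
        for i in range(n_lines)
    ]

def weighted_score(lines, query, highlight, sigma=1.0):
    """Sum the Gaussian weight of a line once for every query term it contains.
    Assumption: a simple whole-word match stands in for the paper's index lookup."""
    weights = gaussian_weights(len(lines), highlight, sigma)
    terms = query.lower().split()
    return sum(
        w
        for line, w in zip(lines, weights)
        for t in terms
        if t in line.lower().split()
    )

# Made-up lyrics; line 2 is taken to be the highlight.
song = [
    "walking down the street",
    "thinking of the rain",
    "shine on me tonight",
    "shine on me forever",
    "fading into grey",
]

near = weighted_score(song, "shine tonight", highlight=2)
far = weighted_score(song, "walking street", highlight=2)
print(near > far)  # matches near the highlight outweigh matches far from it
```

Both queries match two words, but the query drawn from the highlight region receives the larger score because its matching lines sit at or next to the Gaussian's mean.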