Integrated Search | Korea Science

A Novel Query-by-Singing/Humming Method by Estimating Matching Positions Based on Multi-layered Perceptron

Pham, Tuyen Danh;Nam, Gi Pyo;Shin, Kwang Yong;Park, Kang Ryoung
- KSII Transactions on Internet and Information Systems (TIIS), Vol. 7, No. 7, pp. 1657-1670, 2013
The growing number of music files on smartphones and MP3 players makes it difficult for people to find the files they want. Query-by-Singing/Humming (QbSH) systems have therefore been developed to retrieve music from a user's humming or singing, without requiring detailed information such as the song's title or singer. Most previous research on QbSH has used musical instrument digital interface (MIDI) files as reference songs. However, producing MIDI files is time-consuming, and more and more music is being published as the music market develops. Consequently, using the more common MPEG-1 audio layer 3 (MP3) files as reference songs is considered an alternative. There is little previous research on QbSH with MP3 files, however, because the waveform of an MP3 file differs from that of a humming/singing query owing to background music and multiple (polyphonic) melodies. To overcome these problems, we propose a new QbSH method that uses MP3 files on a mobile device. This research is novel in four ways. First, it is the first research on QbSH that uses MP3 files as reference songs. Second, the start and end positions of the segment of the MP3 file to be matched are estimated with a multi-layered perceptron (MLP) before matching against the humming/singing query file. Third, for more accurate results, four MLPs are used: they produce the start and end positions for the dynamic time warping (DTW) matching algorithm and those for the chroma-based DTW algorithm, respectively. Fourth, the two matching scores from the DTW and chroma-based DTW algorithms are combined using the PRODUCT rule, which yields higher matching accuracy. Experimental results with the AFA MP3 database show that the accuracy of the proposed method (Top 1 accuracy of 98%, with an MRR of 0.989) is much higher than that of other methods. We also show the effectiveness of the proposed system on a consumer mobile device.
https://doi.org/10.3837/tiis.2013.07.008
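The matching pipeline described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the `(start, end)` windows are passed in directly (the abstract estimates them with MLPs), the chroma-based DTW costs are hypothetical precomputed numbers, and the function names are invented for this sketch rather than taken from the paper.

```python
from math import inf

def dtw_distance(query, reference):
    """Classic DTW cost between two 1-D pitch sequences (absolute-difference cost)."""
    n, m = len(query), len(reference)
    # cost[i][j] = minimal accumulated cost aligning query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def fused_score(dtw_cost, chroma_dtw_cost):
    """PRODUCT rule: multiply the two matching scores (lower is better here)."""
    return dtw_cost * chroma_dtw_cost

def rank_songs(query, references, windows, chroma_costs):
    """Rank songs by fused score.

    windows[song] = (start, end) is assumed to come from an upstream estimator;
    chroma_costs[song] is a hypothetical, precomputed chroma-based DTW cost.
    """
    scores = {}
    for song, pitch in references.items():
        start, end = windows[song]
        scores[song] = fused_score(dtw_distance(query, pitch[start:end]),
                                   chroma_costs[song])
    return sorted(scores, key=scores.get)

def mean_reciprocal_rank(rankings, truths):
    """MRR: average of 1 / (rank of the correct song) over all queries."""
    return sum(1.0 / (r.index(t) + 1) for r, t in zip(rankings, truths)) / len(truths)

# Toy data (made up for illustration): MIDI-like pitch values.
query = [60, 62, 64]
references = {"song_a": [55, 60, 60, 62, 64, 64, 70],
              "song_b": [50, 60, 61, 65, 52]}
windows = {"song_a": (1, 6), "song_b": (1, 4)}
chroma_costs = {"song_a": 0.2, "song_b": 0.9}
ranking = rank_songs(query, references, windows, chroma_costs)
```

Restricting DTW to the estimated window is what keeps the comparison cheap: the query is aligned against a short slice of the reference rather than the whole song.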

An ICA-Based Subspace Scanning Algorithm to Enhance Spatial Resolution of EEG/MEG Source Localization

정영진;권기운;임창환
- 대한의용생체공학회:의공학회지 (Journal of Biomedical Engineering Research), Vol. 31, No. 6, pp. 456-463, 2010
In the present study, we propose a new subspace scanning algorithm to enhance the spatial resolution of electroencephalography (EEG) and magnetoencephalography (MEG) source localization. Subspace scanning algorithms, represented by the multiple signal classification (MUSIC) algorithm and the first principal vector (FINE) algorithm, have been widely used to localize asynchronous multiple dipolar sources in the human cerebral cortex. The conventional MUSIC algorithm uses principal component analysis (PCA) to extract the noise vector subspace and therefore has difficulty discriminating two or more closely spaced cortical sources. The FINE algorithm addresses this problem by using only part of the noise vector subspace, but there is no established rule for determining the number of noise vectors. In the present work, we estimated a non-orthogonal signal vector set using independent component analysis (ICA) instead of PCA and performed the source scanning process in the signal vector subspace rather than the noise vector subspace. Realistic 2D and 3D computer simulations, which compared the spatial resolutions of various algorithms under different noise levels, showed that the proposed ICA-MUSIC algorithm has the highest spatial resolution, suggesting that it can be a useful tool for practical EEG/MEG source localization.
https://doi.org/10.9718/JBER.2010.31.6.456

Convergence evaluation method using multisensory and matching painting and music using deep learning based on imaginary soundscape

정하영;김영준;조준동
- 한국융합학회논문지 (Journal of the Korea Convergence Society), Vol. 11, No. 11, pp. 175-182, 2020
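This study introduces a technique that uses deep learning to match classical music to paintings in order to compose soundscapes that aid the appreciation of paintings, and proposes evaluation metrics for assessing how well a painting and a piece of music are matched. The evaluation consisted of a suitability rating on a 5-point Likert scale and a multimodal evaluation. In the suitability evaluation by 13 participants, the painting-music matches scored 3.74/5.0, and in the multimodal evaluation by 13 participants, the mean cosine similarity of the painting-music matches was 0.79. The multimodal evaluation is expected to serve as a metric for measuring new user experiences. The study also aims to enhance the multisensory experience of artworks by proposing an interaction between sight and hearing. We expect the proposed painting-music matching to be used in multisensory art exhibitions and, further, to improve accessibility to art appreciation for visually impaired people.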
https://doi.org/10.15207/JKCS.2020.11.11.175
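The mean cosine similarity reported in this abstract could be computed along the following lines. The abstract does not say how the painting and music responses were turned into vectors, so the vectors below are hypothetical placeholders and the function names are chosen only for this sketch.

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

def mean_cosine_similarity(pairs):
    """Average cosine similarity over (painting_vector, music_vector) pairs."""
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)

# Hypothetical rating vectors for two painting-music pairs (not from the paper).
pairs = [([4, 2, 5, 1], [5, 1, 4, 2]),
         ([3, 3, 1, 4], [2, 4, 1, 5])]
average = mean_cosine_similarity(pairs)
```

Because cosine similarity ignores vector length, it compares the shape of two response profiles rather than their overall magnitude.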

A Design and Implementation of Music & Image Retrieval Recommendation System based on Emotion

김태연;송병호;배상현
- 전자공학회논문지CI (Journal of the Institute of Electronics Engineers of Korea CI), Vol. 47, No. 1, pp. 73-79, 2010
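Emotionally intelligent computing gives a computer the ability to recognize and process human emotion through learning and adaptation, enabling more efficient human-computer interaction. Among the kinds of emotional information, music and images, which are auditory and visual information, form impressions quickly and persist in memory for a long time; they are therefore regarded as an important factor in successful marketing and play a very important role in understanding and interpreting human emotion. In this paper, we built a system that retrieves matched music and images according to the user's emotion keyword (irritated, gloomy, calm, joyful). The proposed system defines human emotion as four cases, uses a music-image ontology and an emotion ontology to retrieve normalized music and images, and extracts image features and measures their similarity to obtain the desired results. In addition, to classify emotion recognition information for images, emotional colors and emotional vocabulary were matched in a single space through correspondence analysis and factor analysis. In experiments, the proposed system achieved a matching rate of 82.4% over the four emotional states.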

MUSIC-Based Direction Finding through Simple Signal Subspace Estimation

최양호
- 대한전자공학회논문지SP (Journal of the Institute of Electronics Engineers of Korea SP), Vol. 48, No. 4, pp. 153-159, 2011
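MUSIC (MUltiple SIgnal Classification) estimates the directions of arrival of signals incident on a sensor array, based on the fact that the signal subspace and the noise subspace are orthogonal to each other. To obtain a basis for the noise subspace, the sample matrix is eigendecomposed, which requires a large amount of computation. This paper presents a method that obtains basis vectors for the signal subspace by removing the noise power from the column vectors of the sample matrix, and thereby estimates the arrival angles simply. A cost function is defined using the estimated basis vectors, and the arrival angles are estimated by finding its minima. The minima are found efficiently by applying Brent's method, which is based on parabolic interpolation, rather than a grid method that evaluates the function at grid intervals. Simulation results show that the proposed scheme performs practically the same as the conventional scheme that relies on eigendecomposition of the sample matrix.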
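The idea of minimizing a signal-subspace cost function without a fine grid can be illustrated with a small sketch. Several things here are assumptions and not the paper's method: a single source, a uniform linear array with half-wavelength spacing, a noise-free basis vector taken directly from the true steering vector, and one parabolic interpolation step around a coarse grid minimum standing in for full Brent's method (which adds safeguards and iterates).

```python
import cmath
import math

def steering_vector(theta_deg, n_sensors, spacing=0.5):
    """Uniform linear array response; spacing is in wavelengths (assumed 0.5)."""
    phase = 2.0 * math.pi * spacing * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * k) for k in range(n_sensors)]

def cost(theta_deg, basis):
    """Normalised energy of a(theta) outside the span of one unit-norm basis vector."""
    a = steering_vector(theta_deg, len(basis))
    inner = sum(b.conjugate() * x for b, x in zip(basis, a))
    return 1.0 - abs(inner) ** 2 / len(a)

def parabolic_vertex(x0, x1, x2, f0, f1, f2):
    """Abscissa of the vertex of the parabola through three points."""
    num = (x1 - x0) ** 2 * (f1 - f2) - (x1 - x2) ** 2 * (f1 - f0)
    den = (x1 - x0) * (f1 - f2) - (x1 - x2) * (f1 - f0)
    return x1 if den == 0 else x1 - 0.5 * num / den

def estimate_angle(basis, step=5):
    """Coarse grid search followed by a single parabolic refinement."""
    grid = list(range(-85, 90, step))
    best = min(grid, key=lambda t: cost(t, basis))
    lo, hi = best - step, best + step
    return parabolic_vertex(lo, best, hi,
                            cost(lo, basis), cost(best, basis), cost(hi, basis))

n = 8
true_angle = 23.0  # degrees; made-up test value
basis = [x / math.sqrt(n) for x in steering_vector(true_angle, n)]
estimate = estimate_angle(basis)
```

With a 5-degree grid the raw grid minimum lands on 25 degrees; the parabolic step pulls the estimate much closer to the true angle using only three cost evaluations, which is the kind of saving over a dense grid that the abstract refers to.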

Cho Yong-Pil's 50 years of Music and the Korean Popular Music History

최현우;양은영
- 융합정보논문지 (Journal of Convergence for Information Technology), Vol. 8, No. 4, pp. 199-204, 2018
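The purpose of this study is to analyze the characteristics of the music of Cho Yong-Pil, who has reached the 50th anniversary of his debut, and to evaluate its significance in the history of Korean popular music. The history of popular music from the late 1960s to the present, the span of Cho Yong-Pil's career, is divided into three periods: from the late 1960s to the early 1980s, when the older generation's trot and young people's rock were popular; from the mid-1980s to the early 2000s, when soft rock, heavy metal, and ballads coexisted; and from the late 2000s to the present, when dance music centered on hook songs has dominated the music market. At each inflection point in this history, Cho Yong-Pil moved across diverse genres and contributed to the development of Korean popular music. In the first period, he combined trot and rock to release rock music that carried a distinctly Korean sentiment. In the second period, at the height of rock and the dawn of the ballad, he contributed to the development of both genres. In the third period, he stirred controversy by releasing a hook song, but he also helped create a climate in which vocal ability is valued even in the dance music genre.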
https://doi.org/10.22156/CS4SMB.2018.8.4.199

A Study on the Development for 3D Audio Generation Machine

Kim Sung-Eun;Kim Myong-Hee;Park Man-Gon
- 한국멀티미디어학회논문지 (Journal of Korea Multimedia Society), Vol. 8, No. 6, pp. 807-813, 2005
The production and authoring of digital multimedia content are among the most important fields in multimedia technology. Web-based technology and related multimedia software technology are growing in the IT industry and evolving rapidly in daily life. Digital audio and video processing technology is increasingly used to improve quality of life, and interest in richer sensory and artistic experiences in music and entertainment continues to grow through three-dimensional (3D) digital sound technology as well as 3D digital video technology. Digital audio content services are expanding rapidly over the Internet, and Internet users want audio content services of better quality. Internet users are no longer satisfied with 2-channel stereo sound and instead seek high-quality 5.1-channel sound such as the 3D audio of movies. Meeting this demand for 3D sound, however, may require suitable hardware. In this paper, we extend a previously developed simple 3D audio generator and propose a web-based music bank built on a software 3D audio generation player that works in a 3D sound environment with two speakers, minimizing hardware requirements. We believe this study will contribute greatly to high-quality digital 3D sound services for music and entertainment enthusiasts.

Edutainment contents using Touch-Face

송대현;박재완;이칠우
- 한국콘텐츠학회:학술대회논문집 (Proceedings of the Korea Contents Association Conference), 2008 Spring Conference, pp. 363-366, 2008
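This paper describes edutainment content for young children. Users can experience gugak, Korean traditional music, with virtual instruments rather than real ones, and can sing along to pansori, a form of gugak. In addition, play activities using traditional patterns and basic shapes are built on Touch-Face, an intelligent interface platform that lets users touch the display directly with the hand, an intuitive tool, so that infants and children who are not used to operating devices can use it easily and conveniently. On this foundation, educational content can help young children learn more efficiently.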

Singing Voice Synthesis Using HMM Based TTS and MusicXML

칸 나지브 울라;이정철
- 한국컴퓨터정보학회논문지 (Journal of the Korea Society of Computer and Information), Vol. 20, No. 5, pp. 53-63, 2015
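Singing voice synthesis is the generation of a singing voice by computer from given lyrics and a musical score. HMM-based speech synthesizers, widely used in text-to-speech systems, have recently been applied to singing voice synthesis as well. However, existing implementations require collecting and training on a large singing voice database, which makes them difficult to build. In addition, existing commercial singing voice synthesis systems use a piano-roll score representation that is unfamiliar to the general public, so a user interface based on an easy-to-read standard score format is needed to make learning songs more convenient. To address these problems, this paper proposes a method that generates a singing voice by using the HMM models of an existing read-speech synthesizer and changing the HMM model parameter values through pitch and duration control methods suited to singing. It also proposes a way to implement a singing voice synthesis system that uses a MusicXML-based score editor for entering notes and lyrics as the front end and an HMM-based text-to-speech synthesizer as the back end. Singing voices synthesized with the proposed method were evaluated, and the results confirmed its potential for practical use.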
https://doi.org/10.9708/jksci.2015.20.5.053
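As a rough illustration of how a score can supply the pitch and duration targets mentioned in this abstract, the sketch below reads a small MusicXML fragment and derives a target fundamental frequency and length in seconds for each sung note. It covers only the score-reading side; how the paper modifies HMM parameters is not shown. The fragment, the 120 bpm tempo, and the function name `note_targets` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Semitone offset of each note name within an octave (C = 0).
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# A MusicXML fragment (not a complete score file), written for illustration.
SCORE = """<part id="P1"><measure number="1">
  <attributes><divisions>2</divisions></attributes>
  <note><pitch><step>A</step><octave>4</octave></pitch><duration>2</duration>
        <lyric><text>la</text></lyric></note>
  <note><pitch><step>C</step><alter>1</alter><octave>5</octave></pitch><duration>1</duration>
        <lyric><text>li</text></lyric></note>
</measure></part>"""

def note_targets(xml_text, tempo_bpm=120.0):
    """Return (lyric, f0_hz, seconds) for each pitched note in the fragment."""
    root = ET.fromstring(xml_text)
    # <divisions> gives the number of duration units per quarter note.
    divisions = int(root.find(".//divisions").text)
    targets = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        if pitch is None:  # rests have no <pitch> element
            continue
        step = pitch.findtext("step")
        alter = int(pitch.findtext("alter", default="0"))  # whole semitones only
        octave = int(pitch.findtext("octave"))
        midi = 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter
        f0 = 440.0 * 2.0 ** ((midi - 69) / 12.0)  # equal temperament, A4 = 440 Hz
        quarters = int(note.findtext("duration")) / divisions
        seconds = quarters * 60.0 / tempo_bpm
        targets.append((note.findtext("lyric/text", default=""), round(f0, 2), seconds))
    return targets

targets = note_targets(SCORE)
```

Each tuple pairs a syllable with the pitch and length it should be sung at, which is the information a synthesizer back end would need from the score editor.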

Client-driven Animated Keyframe Generation System Using Music Analysis

무즈타바 굴람;김선대;박은수;김승환;유재성;류은석
- 한국방송∙미디어공학회:학술대회논문집 (Proceedings of the Korean Institute of Broadcast and Media Engineers Conference), 2019 Summer Conference, pp. 173-175, 2019
Animated image formats such as WebP are highly portable graphics formats that are used widely on the Internet. Despite its small size and short duration, a WebP image lets a user preview a video with minimal bandwidth, without watching the entire content. This paper proposes a novel method for generating personalized WebP images on the client side using the client's own computing resources. The proposed system automatically extracts the WebP image from the climax point of the content using music analysis. Based on user interest, the system predicts the genre using a Convolutional Neural Network (CNN). The proposed method can be integrated easily with streaming platforms such as YouTube, Netflix, Hulu, and others.

Search results: 612 items (processing time 0.024 s)
