Search | Korea Science

Prediction of Closed Quotient During Vocal Phonation using GRU-type Neural Network with Audio Signals

Hyeonbin Han;Keun Young Lee;Seong-Yoon Shin;Yoseup Kim;Gwanghyun Jo;Jihoon Park;Young-Min Kim
- Journal of information and communication convergence engineering, v.22 no.2, pp.145-152, 2024
Closed quotient (CQ) represents the time ratio for which the vocal folds remain in contact during voice production. Because CQ values serve as an important reference point in vocal training for professional singers, they have been measured mechanically or electrically, either by inverse filtering of airflows captured by a circumferentially vented mask or by post-processing of electroglottography waveforms. In this study, we introduce a novel algorithm that predicts CQ values from audio signals alone, eliminating the need for mechanical or electrical measurement techniques. Our algorithm is based on a gated recurrent unit (GRU)-type neural network. To enhance efficiency, we pre-process the audio signal using a pitch feature extraction algorithm. GRU-type neural networks then extract the features, followed by a dense layer for the final prediction. The Results section reports the mean square error between the predicted and real CQ, which shows the capability of the proposed algorithm to predict CQ values.
https://doi.org/10.56977/jicce.2024.22.2.145
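The pipeline described in this abstract (per-frame pitch features, a GRU, then a dense layer, scored by mean square error) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the gate equations follow one common GRU formulation, the dense layer is taken to be linear, and all names (`gru_step`, `predict_cq`, `zero_params`) and sizes are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def gru_step(x, h, p):
    """One GRU update. p holds input weights W*, hidden weights U*, biases b*."""
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]   # update gate
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]   # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b + c) for a, b, c in
         zip(matvec(p["Wn"], x), matvec(p["Un"], rh), p["bn"])]  # candidate state
    return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]

def predict_cq(frames, p, w_out, b_out):
    """Run the GRU over per-frame features, then apply a linear dense layer."""
    h = [0.0] * len(p["bz"])
    for x in frames:
        h = gru_step(x, h, p)
    return sum(w * hi for w, hi in zip(w_out, h)) + b_out

def mse(predicted, actual):
    return sum((a - b) ** 2 for a, b in zip(predicted, actual)) / len(actual)

def zero_params(n_in, n_hid):
    # All-zero weights: a placeholder for trained parameters (assumption).
    p = {}
    for gate in "zrn":
        p["W" + gate] = [[0.0] * n_in for _ in range(n_hid)]
        p["U" + gate] = [[0.0] * n_hid for _ in range(n_hid)]
        p["b" + gate] = [0.0] * n_hid
    return p
```

In practice the weights would come from training on audio paired with measured CQ values; with the all-zero placeholder the hidden state stays at zero and the prediction reduces to the dense-layer bias.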

Computationally Efficient Propagator Method for DoA with Coprime Array (서로소 배열에서 프로퍼게이터 방법 기반의 효율적인 도래각 추정 기법)

Byun, Bu-Guen;Yoo, Do-Sik
- Journal of Advanced Navigation Technology, v.20 no.3, pp.258-264, 2016
In this paper, we propose a computationally efficient direction of arrival (DoA) estimation algorithm based on the propagator method with a non-uniform array. While co-prime array techniques can improve DoA resolution, they generally incur high computational complexity that grows with the length of the coarray aperture. To reduce the complexity, we use the propagator method, which does not require singular value decomposition (SVD). Through simulations, we compare MUSIC with a uniform linear array, the propagator method with a uniform linear array, MUSIC with a co-prime array, and the proposed scheme. We observe that the proposed scheme performs significantly better than MUSIC or the propagator method with a uniform linear array, while performing slightly worse than the computationally much more expensive co-prime array MUSIC scheme.
https://doi.org/10.12673/jant.2016.20.3.258
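To illustrate why the propagator method avoids an SVD, the sketch below estimates a single source direction from a covariance matrix using only a least-squares division. It is a simplified illustration, not the paper's scheme: it assumes one source, a half-wavelength uniform linear array rather than a co-prime array, an exact (not sampled) covariance, and a particular steering-vector sign convention. All function names are hypothetical.

```python
import cmath
import math

def steering(theta_deg, n_sensors):
    """ULA steering vector, half-wavelength spacing (sign convention assumed)."""
    phase = math.pi * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * m) for m in range(n_sensors)]

def covariance_one_source(theta_deg, n_sensors, noise_power=0.0):
    """Exact covariance R = a a^H + noise_power * I for one unit-power source."""
    a = steering(theta_deg, n_sensors)
    return [[a[i] * a[j].conjugate() + (noise_power if i == j else 0.0)
             for j in range(n_sensors)] for i in range(n_sensors)]

def propagator_one_source(R):
    """Least-squares propagator P = (G^H G)^{-1} G^H H.

    With one source, G is the first column of R and H the remaining columns,
    so the matrix inverse collapses to a scalar division.
    """
    n = len(R)
    g = [R[i][0] for i in range(n)]
    gg = sum(abs(v) ** 2 for v in g)
    return [sum(g[i].conjugate() * R[i][j] for i in range(n)) / gg
            for j in range(1, n)]

def estimate_doa(R, grid_deg):
    """Return the grid angle minimizing ||Q^H a(theta)||^2, Q^H = [P^H, -I]."""
    P = propagator_one_source(R)

    def cost(theta):
        a = steering(theta, len(R))
        return sum(abs(P[j].conjugate() * a[0] - a[j + 1]) ** 2
                   for j in range(len(P)))

    return min(grid_deg, key=cost)
```

For example, `estimate_doa(covariance_one_source(20.0, 6, 0.1), range(-90, 91))` searches a one-degree grid. With several sources, G would hold as many columns as there are sources and the division would become a small matrix inverse, which is still cheaper than a full decomposition.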

A Hybrid Music Recommendation System Combining Listening Habits and Tag Information (사용자 청취 습관과 태그 정보를 이용한 하이브리드 음악 추천 시스템)

Kim, Hyon Hee;Kim, Donggeon;Jo, Jinnam
- Journal of the Korea Society of Computer and Information, v.18 no.2, pp.107-116, 2013
In this paper, we propose a hybrid music recommendation system that combines users' listening habits and tag information in a social music site. Most commercial music recommendation systems recommend music items based on the number of plays and explicit ratings of a song. However, this approach has difficulty recommending new items with only a few ratings or recommending items to new users about whom little is known. To resolve the problem, we use tag information generated by collaborative tagging. According to the meaning of each tag, a weighted value is assigned as the score of that tag for a music item. By combining the tag scores and the number of plays, user profiles are created and a collaborative filtering algorithm is executed. For performance evaluation, precision, recall, and F-measure are calculated for the listening habit-based recommendation, the tag score-based recommendation, and the hybrid recommendation, respectively. Our experiments show that the hybrid recommendation system outperforms the other two approaches.
https://doi.org/10.9708/jksci.2013.18.2.107
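The steps in this abstract (blend play counts with tag scores into a user profile, run collaborative filtering, then evaluate with precision, recall, and F-measure) can be sketched as below. The abstract does not say how the two signals are combined, so the blending weight `alpha`, the normalization by each user's maximum play count, and the cosine-similarity user-based filter are all assumptions made for illustration.

```python
import math

def build_profile(plays, tag_scores, alpha=0.5):
    """Blend play counts (scaled by the user's max) with tag scores per item."""
    max_plays = max(plays.values(), default=0) or 1
    items = set(plays) | set(tag_scores)
    return {i: alpha * plays.get(i, 0) / max_plays
               + (1 - alpha) * tag_scores.get(i, 0.0)
            for i in items}

def cosine(u, v):
    """Cosine similarity between two sparse profiles (dicts item -> score)."""
    dot = sum(u[i] * v[i] for i in set(u) & set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(target, others, top_n=1):
    """User-based collaborative filtering over items the target has not seen."""
    scores = {}
    for profile in others:
        sim = cosine(target, profile)
        for item, value in profile.items():
            if item not in target:
                scores[item] = scores.get(item, 0.0) + sim * value
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

def precision_recall_f(recommended, relevant):
    hits = len(set(recommended) & set(relevant))
    p = hits / len(recommended) if recommended else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

Because the profile keeps a tag-score term even when an item has few plays, a sparsely played item can still contribute to similarity, which is the cold-start motivation the abstract gives for using tags.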

Sentiment Analysis Engine for Cambodian Music Industry Re-building (캄보디아 음악 산업 재건을 위한 감정 분석 엔진 연구)

Khoeurn, Saksonita;Kim, Yun Seon
- Journal of the Korea Society for Simulation, v.26 no.4, pp.23-34, 2017
During the Khmer Rouge regime, Cambodian pop music was completely forgotten because 90% of its artists were killed. After the country began recovering from war in 1979, the music started to grow again in 1990. However, the dynamics and flows of Cambodian popular music are visibly shaped by multifaceted socioeconomic, political, and creative forces. The major problems are plagiarism and piracy, which have prevailed in the industry for years. Recently, awareness of the need to preserve original Khmer songs has grown among both fans and artists and has become a new trend among Cambodia's young population. Still, music quality remains limited. To strengthen this mind-set, feedback and inspiration are needed. This study suggests a music ranking website based on sentiment analysis, with data collected from the posts and comments on production companies' Facebook pages. The study proposes an algorithm that translates Khmer to English, performs sentiment analysis, and generates the ranking. The results show 80% accuracy for translation and sentiment analysis in the proposed system. The songs that rank high in the system are original songs that fit occasions in Cambodia. The proposed ranking algorithm would help increase the competitive advantage of music productions and encourage producers to compose new songs that fit particular activities and events.
https://doi.org/10.9709/JKSS.2017.26.4.023

Speech/Music Signal Classification Based on Spectrum Flux and MFCC For Audio Coder (오디오 부호화기를 위한 스펙트럼 변화 및 MFCC 기반 음성/음악 신호 분류)

Sangkil Lee;In-Sung Lee
- The Journal of Korea Institute of Information, Electronics, and Communication Technology, v.16 no.5, pp.239-246, 2023
In this paper, we propose an open-loop algorithm that classifies speech and music signals for an audio coder using spectral flux parameters and Mel Frequency Cepstral Coefficient (MFCC) parameters. To increase responsiveness, the MFCCs are used as short-term feature parameters, and to improve accuracy, spectral fluxes are used as long-term feature parameters. The overall speech/music classification decision is made by combining the short-term and long-term classification methods. A Gaussian mixture model (GMM) is used for pattern recognition, and the optimal GMM parameters are extracted using the Expectation Maximization (EM) algorithm. The proposed combined long-term and short-term speech/music classification method showed an average classification error rate of 1.5% on various audio sources, reducing the classification error rate by 0.9% compared to the short-term-only method and by 0.6% compared to the long-term-only method. Compared to the Unified Speech and Audio Coding (USAC) audio classification method, the proposed method reduced the classification error rate by 9.1% for percussive music signals with attacks and by 5.8% for voice signals.
https://doi.org/10.17661/jkiiect.2023.16.5.239
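Spectral flux, the long-term feature named in this abstract, measures how much the magnitude spectrum changes from one frame to the next. The sketch below uses one common definition (squared difference of consecutive magnitude spectra); the paper's exact definition, frame length, and averaging window are not given in the abstract, so these are assumptions, and the function names are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (a sketch; use an FFT in practice)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def spectral_flux(frames):
    """Squared L2 distance between magnitude spectra of consecutive frames."""
    spectra = [magnitude_spectrum(f) for f in frames]
    return [sum((b - a) ** 2 for a, b in zip(prev, cur))
            for prev, cur in zip(spectra, spectra[1:])]

def sine_frame(cycles, length=16):
    # Test tone with a whole number of cycles per frame (illustrative input).
    return [math.sin(2 * math.pi * cycles * t / length) for t in range(length)]
```

Feeding `spectral_flux([sine_frame(2), sine_frame(2), sine_frame(5)])` gives a zero first value (no change between identical frames) and a large second value (the energy moves to a different bin). A classifier along the lines of the abstract would accumulate such values over a longer window before modeling them with a GMM.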

Automatic Emotion Classification of Music Signals Using MDCT-Driven Timbre and Tempo Features

Kim, Hyoung-Gook;Eom, Ki-Wan
- The Journal of the Acoustical Society of Korea, v.25 no.2E, pp.74-78, 2006
This paper proposes an effective method for classifying the emotion of music from its acoustic signals. Two feature sets, timbre and tempo, are extracted directly from the modified discrete cosine transform (MDCT) coefficients, which are the output of a partial MP3 (MPEG-1 Layer 3) decoder. Our tempo feature extraction method is based on long-term modulation spectrum analysis. To effectively combine these two feature sets, which have different time resolutions, in an integrated system, a two-layer classifier based on the AdaBoost algorithm is used. The first layer employs the MDCT-driven timbre features. Adding the MDCT-driven tempo feature in the second layer improves the classification precision dramatically.

Beamforming-based Partial Field Decomposition in Acoustical Holography (음향 홀로-그래피에서 빔 형성을 이용한 부분 음장 분리)

황의석;조영만;강연준
- Transactions of the Korean Society for Noise and Vibration Engineering, v.11 no.6, pp.200-207, 2001
In this paper, a new method for partial field decomposition is developed. It is based on the beamforming algorithm and is intended for applying acoustical holography to a composite sound field generated by multiple incoherent sound sources. In the proposed method, source positions are first predicted by the MUSIC (multiple signal classification) algorithm. The composite sound field can then be decomposed into its partial fields by beamforming. Results of both numerical simulations and experiments show that the method can find each partial field very accurately and effectively, and that it also has the potential to be applied to distributed sources.

What Do The Algorithms of The Online Video Platform Recommend: Focusing on Youtube K-pop Music Video (온라인 동영상 플랫폼의 알고리듬은 어떤 연관 비디오를 추천하는가: 유튜브의 K POP 뮤직비디오를 중심으로)

Lee, Yeong-Ju;Lee, Chang-Hwan
- The Journal of the Korea Contents Association, v.20 no.4, pp.1-13, 2020
In order to understand the recommendation algorithm applied to online video platforms, this study examines the relationship between the content characteristics of K-pop music videos and the related videos recommended for playback on YouTube, and uses network analysis to identify which videos are recommended as related videos. The results show that the more likes a video has, the higher its recommendation ranking, and that most related videos belong to the same channel or were produced by the same agency. The network analysis of related videos shows that K-pop music videos form a strongly connected network in which BTS music videos are highly central. These results suggest that, because the network among K-pop videos is strong, a viewer who enters K-pop as a search query and watches videos can keep enjoying K-pop continuously. When watching other genres of video, however, K-pop may not be recommended as a related video.
https://doi.org/10.5392/JKCA.2020.20.04.001

Ship Positioning Estimation Using Phased Array Antenna in FMCW Radar System for Small-Sized Ships (소형 선박용 FMCW 레이더 시스템에서의 위상 배열 안테나를 사용한 선박의 위치 추정)

Lee, Seongwook;Lee, Seong Ro;Kim, Seong-Cheol
- The Journal of Korean Institute of Communications and Information Sciences, v.40 no.6, pp.1130-1141, 2015
Conventionally, pulse radar is used on middle-sized or large-sized ships to detect other ships or obstacles at long distances. However, it is rarely installed on small-sized ships because of its mounting and maintenance costs. Therefore, FMCW (frequency modulated continuous wave) radar has been suggested as an alternative for small-sized ships. Because it operates with low power and has good range resolution for relatively close objects, it is suitable for small-sized ships. Previously proposed FMCW radar systems estimate only the distance and velocity of a target ship in the direction of the main beam and have difficulty detecting several ships simultaneously. Thus, we suggest a method for detecting several ships at the same time by applying the MUSIC (multiple signal classification) algorithm to the FMCW radar signal received by a phased array antenna. In addition, combining digital beamforming with the MUSIC algorithm achieves better angle resolution.
https://doi.org/10.7840/kics.2015.40.6.1130
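The range-resolution remark in this abstract follows from the standard relations for a linear FMCW chirp: resolution depends only on the sweep bandwidth, and the range of a stationary target is proportional to the measured beat frequency. The sketch below encodes those two textbook relations. The bandwidth and sweep time used in the usage note are hypothetical values, not parameters from the paper.

```python
C = 299_792_458.0  # speed of light in m/s

def range_resolution(bandwidth_hz):
    """Range resolution of a linear FMCW chirp: c / (2 B)."""
    return C / (2.0 * bandwidth_hz)

def beat_to_range(beat_hz, bandwidth_hz, sweep_s):
    """Range of a stationary target from its beat frequency.

    For a linear sweep of bandwidth B over duration T, a target at range R
    produces a beat frequency f_b = 2 R B / (c T), so R = c f_b T / (2 B).
    """
    return C * beat_hz * sweep_s / (2.0 * bandwidth_hz)
```

With an assumed 150 MHz sweep, `range_resolution(150e6)` is about 1 m. A moving target adds a Doppler shift to the beat frequency, which is how velocity is estimated alongside distance; that step is omitted here.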

Improving SVM with Second-Order Conditional MAP for Speech/Music Classification (음성/음악 분류 향상을 위한 2차 조건 사후 최대 확률기법 기반 SVM)

Lim, Chung-Soo;Chang, Joon-Hyuk
- Journal of the Institute of Electronics Engineers of Korea SP, v.48 no.5, pp.102-108, 2011
Support vector machines are well known for their outstanding performance in pattern recognition. One example of their application is speech/music classification for a standardized codec such as the 3GPP2 selectable mode vocoder. In this paper, we propose a novel scheme that improves the speech/music classification of support vector machines based on the second-order conditional maximum a posteriori (MAP) criterion. While conventional support vector machine optimization techniques are applied during the training phase, the proposed technique can be adopted in the classification phase. The proposed approach can therefore be developed and employed in parallel with conventional optimizations, resulting in a synergistic boost in classification performance. According to experimental results, the proposed algorithm is compatible with conventional optimizations and shows potential for improving the performance of support vector machines.

Search Result 351, Processing Time 0.028 seconds
