• Title/Summary/Keyword: music enhancement


Frequency Range Enhancement for Faster Convergence of Neural Music Source Separation Systems (신경망 기반 음원 분리 시스템의 학습 속도 향상을 위한 음역대 강조 기법)

  • Kim, Min-Seok;Choi, Woo-Sung;Jung, Soon-Young
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.567-569 / 2020
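  • Among source separation techniques, which extract the sound of a desired instrument from a recording in which several instruments are mixed, neural-network-based systems have recently been studied actively. Noting that each instrument has its own characteristic frequency range, the authors propose a frequency range enhancement technique that adds a small number of trainable parameters to an existing source separation network and substantially speeds up training.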
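
One way to picture "a small number of trainable parameters" tied to frequency range is a single gain per frequency bin, applied to the spectrogram before it enters the separation network. The sketch below is an illustrative assumption, not the authors' architecture: `emphasize_bands` and the gain values are hypothetical, and in a real system the gains would be learned jointly with the network rather than fixed by hand.

```python
def emphasize_bands(spectrogram, gains):
    """Scale each frequency bin of a magnitude spectrogram by its own gain.

    spectrogram: list of frames, each a list of per-bin magnitudes.
    gains: one weight per frequency bin (hypothetically learnable).
    """
    if any(len(frame) != len(gains) for frame in spectrogram):
        raise ValueError("every frame must have one value per gain")
    return [[g * m for g, m in zip(gains, frame)] for frame in spectrogram]


# Toy input: 2 frames x 4 bins. The gains boost the two lowest bins,
# as one might for a bass-heavy target instrument.
spec = [[1.0, 2.0, 3.0, 4.0],
        [0.5, 0.5, 0.5, 0.5]]
gains = [2.0, 1.5, 1.0, 0.5]
weighted = emphasize_bands(spec, gains)
```

With one weight per bin, the number of added parameters equals the number of frequency bins, which is small next to a typical separation network.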

Exploring the Characteristics of Environmental Catalysts of the Disadvantaged Gifted in Music (사회적 배려대상 음악영재의 환경요인 특징 탐색)

  • Kim, Sunghye;Lee, Kyungjin
    • Journal of Gifted/Talented Education / v.24 no.4 / pp.629-655 / 2014
  • This study explores the characteristics of the environmental catalysts that have affected the development of musical giftedness in disadvantaged students. To this end, it examines the self-evaluation tests, personal statements, and interviews of nineteen disadvantaged students gifted in music. Based on the environmental catalysts in Gagné's Differentiated Model of Giftedness and Talent (DMGT), the analysis of the interviews indicates that the milieu of the disadvantaged gifted rarely exerts a positive influence on their musical activities and studies. Although parents care about music and support their children financially and emotionally, they tend to exert an unintentionally negative influence through their misunderstanding of giftedness and unqualified advice. On the whole, the disadvantaged gifted rarely regard their teachers as experts in music. In terms of provisions, most students participate in out-of-school and local programs, but none participates in a gifted music program, and they are not satisfied with the quality of their education. Despite the importance of events such as crystallizing experiences, awards, and performances, most students have not had enough such events to inspire their giftedness. In conclusion, this study proposes a strategy for improving the environmental catalysts for the disadvantaged gifted in several ways, including improving social recognition, enhancing parent consulting and teacher training programs, and developing and disseminating higher-quality gifted programs.

Speech Enhancement for Voice commander in Car environment (차량환경에서 음성명령어기 사용을 위한 음성개선방법)

  • 백승권;한민수;남승현;이봉호;함영권
    • Journal of Broadcast Engineering / v.9 no.1 / pp.9-16 / 2004
  • In this paper, we present a speech enhancement method that serves as a pre-processor for a voice commander in a car environment. For friendly and safe use of a voice commander in a moving car, non-stationary audio signals such as music and non-candidate speech should be reduced. Our technique is based on two microphones and consists of two parts: blind source separation (BSS) and Kalman filtering. First, BSS operates as a spatial filter to deal with non-stationary signals; car noise is then reduced by Kalman filtering, which acts as a temporal filter. The algorithm's performance is tested on speech recognition, and the results show that our two-microphone technique is a good candidate for a voice commander.
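
To illustrate the temporal-filter stage only, here is a minimal scalar Kalman filter with a random-walk state model. It is a sketch under assumed settings, not the paper's filter: the function name, the state model, and the variance values are all hypothetical, and a speech application would normally use a higher-order model of the signal.

```python
def kalman_smooth(measurements, process_var=1e-3, meas_var=0.1,
                  init_state=0.0, init_var=1.0):
    """Scalar Kalman filter assuming the state follows a random walk."""
    x, p = init_state, init_var
    estimates = []
    for z in measurements:
        p = p + process_var      # predict: uncertainty grows by the process noise
        k = p / (p + meas_var)   # Kalman gain: trust in the new measurement
        x = x + k * (z - x)      # update the state with the innovation
        p = (1.0 - k) * p        # posterior variance after the update
        estimates.append(x)
    return estimates


# A constant measurement of 1.0: the estimate starts near the first
# measurement and settles toward 1.0 as the gain shrinks.
est = kalman_smooth([1.0] * 50)
```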

Multi-channel Speech Enhancement Using Blind Source Separation and Cross-channel Wiener Filtering

  • Jang, Gil-Jin;Choi, Chang-Kyu;Lee, Yong-Beom;Kim, Jeong-Su;Kim, Sang-Ryong
    • The Journal of the Acoustical Society of Korea / v.23 no.2E / pp.56-67 / 2004
  • Despite abundant research on blind source separation (BSS) in many types of simulated environments, its performance is still not satisfactory for real environments. The major obstacles appear to be the finite filter length of the assumed mixing model and nonlinear sensor noise. This paper presents a two-step speech enhancement method with multiple microphone inputs. The first step runs a frequency-domain BSS algorithm to produce multiple outputs without any prior knowledge of the mixed source signals. The second step removes the remaining cross-channel interference with a spectral cancellation approach that uses probabilistic source absence/presence detection. The desired primary source is detected in every frame of the signal, and the secondary source is estimated in the power spectral domain using the other BSS output as a reference interfering source. The estimated secondary source is then subtracted to reduce the cross-channel interference. Experimental results show improved separation on real recordings of speech and music signals compared with conventional BSS methods.
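
The subtraction in the second step can be sketched as plain power spectral subtraction with a floor. This is a simplified illustration that omits the paper's probabilistic presence detection; `subtract_interference`, `alpha`, and `floor` are hypothetical names and values chosen for the example.

```python
def subtract_interference(primary_power, reference_power, alpha=1.0, floor=0.01):
    """Subtract a scaled interference power estimate from each bin.

    primary_power: per-bin power of the BSS output holding the target.
    reference_power: per-bin power of the other BSS output (interference).
    alpha: over-subtraction factor (assumed); floor: fraction of the
    original power kept as a minimum, so no bin goes negative.
    """
    cleaned = []
    for p, r in zip(primary_power, reference_power):
        cleaned.append(max(p - alpha * r, floor * p))
    return cleaned


# Bin 0: interference is partly removed. Bin 1: the estimate exceeds the
# primary power, so the floor applies. Bin 2: nothing to remove.
cleaned = subtract_interference([4.0, 1.0, 0.5], [1.0, 2.0, 0.0])
```

The floor is a common guard against negative power values, which would otherwise appear wherever the interference estimate overshoots.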

Using Focus Group Interview to Explore the Effectiveness of Adolescent Smoking Cessation Program with Music Therapy (음악중재 청소년 금연교실 파일럿 연구: 포커스 그룹 인터뷰)

  • HwangMyung, Hee-Song
    • Korean Journal of Health Education and Promotion / v.27 no.4 / pp.131-139 / 2010
  • Objectives: This pilot study examined whether an adolescent smoking cessation program with harmonica therapy was effective. It qualitatively explored perceived consequences of smoking, cessation and relapse experiences, the specific ways the harmonica helped to overcome the urge to smoke, preference for the harmonica as a cessation aid, and plans for using the harmonica intervention to quit. Methods: The treatment program consisted of six 30-minute sessions held once a week. Qualitative data were collected through a focus group interview with 6 participants at the completion of the program and analyzed using Krueger's systematic process. Results: Participants smoked daily, consuming 3-10 cigarettes. They recognized undesirable consequences of smoking in terms of cost, interpersonal relationships, and health, which might have led to past cessation attempts. Participants who did not want to quit smoking at the beginning of the program changed their attitude toward quitting after attempting partial cessation with the help of harmonica therapy. They believed the harmonica would consistently help them quit and lead to success. Conclusion: The change in adolescents' attitudes toward smoking cessation suggests that harmonica therapy can enhance motivation, the lack of which was a major barrier to successfully quitting.

Preprocessing method for enhancing digital audio quality in speech communication system (음성통신망에서 디지털 오디오 신호 음질개선을 위한 전처리방법)

  • Song Geun-Bae;Ahn Chul-Yong;Kim Jae-Bum;Park Ho-Chong;Kim Austin
    • Journal of Broadcast Engineering / v.11 no.2 s.31 / pp.200-206 / 2006
  • This paper presents a preprocessing method that modifies the input audio signal of a speech coder so that the signal obtained at the decoder is enhanced. For this purpose, we introduce a noise suppression (NS) scheme and adaptive gain control (AGC), in which the audio input and its coding error are treated as a noisy signal and a noise, respectively. The coding error is suppressed from the input, and the suppressed input is then level-aligned to the original input by the subsequent AGC operation. As a result, this preprocessing redistributes the spectral energy of the music input across the spectral domain so that the preprocessed music can be coded more effectively by the subsequent coder. As a side effect, the procedure requires an additional encoding pass to calculate the coding error. However, it provides a generalized formulation applicable to many existing speech coders. Preference listening tests indicated that the proposed approach significantly enhances perceived music quality.
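
The level-alignment role of the AGC can be sketched as a single gain that matches the RMS of the suppressed signal to that of the original input. This is an assumed, block-wise simplification; the abstract does not say how the AGC is implemented, and `align_level` and the sample values are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def align_level(processed, reference, eps=1e-12):
    """Scale `processed` so its RMS level matches that of `reference`."""
    gain = rms(reference) / (rms(processed) + eps)  # eps avoids division by zero
    return [gain * v for v in processed]


original = [0.5, -0.5, 0.5, -0.5]
suppressed = [0.2, -0.1, 0.2, -0.1]   # stand-in for the noise-suppressed input
aligned = align_level(suppressed, original)
```

Matching the level keeps the suppression step from changing the overall loudness of what reaches the coder.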

Intelligibility Enhancement of Multimedia Contents Using Spectral Shaping (스펙트럼 성형기법을 이용한 멀티미디어 콘텐츠의 명료도 향상)

  • Ji, Youna;Park, Young-cheol;Hwang, Young-su
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.11 / pp.82-88 / 2016
  • In this paper, we propose an intelligibility enhancement algorithm for multimedia content using spectral shaping. Dialogue is essential to understanding the plot of audio-visual media such as movies and TV programs. However, non-dialogue components such as sound effects and background music often degrade dialogue clarity. To overcome this problem, this paper aims to improve the dialogue clarity of audio soundtracks, which contain important cues for the visual scenes. In the proposed method, the dialogue components are first detected by a soft masker based on speech presence probability (SPP), which is widely used in the speech enhancement field. The extracted dialogue signals are then passed to the spectral shaping method, which reallocates the spectral-temporal energy of speech to enhance intelligibility. The total energy is kept unchanged by a loudness normalization process to prevent saturation. The algorithm was evaluated on modeled and real movie soundtracks, and the results show that it enhances dialogue clarity while preserving the total audio power.
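
The idea of reallocating energy while keeping the total unchanged can be sketched on a single spectral frame. Plain energy matching is used here as a simpler stand-in for the paper's loudness normalization, and the function name and shaping gains are hypothetical.

```python
import math


def shape_preserving_energy(magnitudes, shaping_gains):
    """Apply per-bin shaping gains, then rescale to the original total energy."""
    shaped = [g * m for g, m in zip(shaping_gains, magnitudes)]
    energy_in = sum(m * m for m in magnitudes)
    energy_out = sum(s * s for s in shaped)
    if energy_out == 0.0:
        return shaped  # nothing to rescale in an all-zero frame
    scale = math.sqrt(energy_in / energy_out)
    return [scale * s for s in shaped]


# Flat toy frame; the gains favor bin 2 at the expense of bin 0.
frame = [1.0, 1.0, 1.0, 1.0]
shaped = shape_preserving_energy(frame, [0.5, 1.0, 2.0, 1.0])
```

After rescaling, the emphasized bin carries a larger share of the same total energy, so the frame is no louder overall and does not risk saturation.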

Stereo Sound Image Expansion Using Phase Difference and Sound Pressure Level Difference in Television (위상차와 음압 레벨차를 이용한 텔레비전에서의 스테레오 음상 확대)

  • 박해광;오제화
    • Proceedings of the IEEK Conference / 1998.10a / pp.1243-1246 / 1998
  • Three-dimensional (3-D) sound is a technique for generating or recreating sounds so that they are perceived as emanating from locations in three-dimensional space. It has the potential to increase the feeling of realism in music or movie soundtracks. Three-dimensional sound effects depend on the psychoacoustic spectral and phase cues present in the reproduced signal. In this paper, we propose an effective algorithm for sound image expansion in television systems using stereo image enhancement techniques. Compared with other three-dimensional sound techniques, the proposed algorithm uses only two speakers to expand the sound image while maintaining the original sound characteristics.
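
For a feel of two-speaker stereo image enhancement, the sketch below uses mid/side widening, a common generic technique that raises the level difference between channels. It is a swapped-in illustration, not the phase-difference and sound-pressure-level method the paper proposes; `widen_stereo` and the `width` parameter are hypothetical.

```python
def widen_stereo(left, right, width=1.5):
    """Widen a stereo pair by scaling the side (L - R) component.

    width = 1.0 returns the input unchanged; larger values exaggerate
    the difference between channels while leaving the mid (L + R) as is.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)
        side = 0.5 * (l - r) * width
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right


# First sample differs between channels and is widened; the second is
# identical in both channels (pure mid) and passes through untouched.
wide_l, wide_r = widen_stereo([1.0, 0.5], [0.0, 0.5], width=2.0)
```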


Design of Multimedia Contents and Internet Search System for Passenger in Train (열차 승객을 위한 멀티미디어콘텐츠 및 인터넷 검색 시스템 설계에 관한 연구)

  • Chang, Duk-Jin;Kang, Song-Hee;Park, Hyun-Hyu;Kang, Dae-Ho;Heo, Jae-Seok;Song, Dahl-Ho
    • Proceedings of the KSR Conference / 2010.06a / pp.442-447 / 2010
  • To markedly enhance high-speed rail passenger services, a system that provides various multimedia content and Internet search functions was designed. The system takes input from a passenger and displays multimedia content on a touch-sensitive LCD panel attached to the passenger seat. This kind of service is new in Korea and is not easy to find in other countries either. In this paper, we present the design of a system that provides not only one-way broadcasting services but also interactive search of various information. The information provided includes train schedules, transfer information, tour information, e-books, movies, music, etc. Successful completion of the system development in the coming years is expected to strengthen the international competitiveness of the Korean railway industry.


A study on combination of loss functions for effective mask-based speech enhancement in noisy environments (잡음 환경에 효과적인 마스크 기반 음성 향상을 위한 손실함수 조합에 관한 연구)

  • Jung, Jaehee;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.40 no.3 / pp.234-240 / 2021
  • In this paper, mask-based speech enhancement is improved for effective speech recognition in noisy environments. In mask-based speech enhancement, the enhanced spectrum is obtained by multiplying the noisy speech spectrum by a mask. The VoiceFilter (VF) model is used for mask estimation, and the Spectrogram Inpainting (SI) technique is used to remove residual noise from the enhanced spectrum. We propose a combined loss to further improve speech enhancement: to effectively remove residual noise in the speech, the positive part of the triplet loss is used together with the component loss. For the experiments, the TIMIT database is reconstructed using NOISEX92 noise and background music samples under various signal-to-noise ratio (SNR) conditions. Source-to-distortion ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI) are used as performance metrics. When the VF was trained with the mean squared error and the SI model was trained with the combined loss, SDR, PESQ, and STOI improved by 0.5, 0.06, and 0.002, respectively, compared with the system trained only with the mean squared error.
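
The masking step described above, together with a basic distortion measure, can be sketched as follows. The mask is hand-picked for illustration rather than estimated by VoiceFilter, and `sdr_db` is a simple energy-ratio definition that may differ from the evaluation toolkit used in the paper; all names and values are hypothetical.

```python
import math


def apply_mask(noisy_mag, mask):
    """Element-wise product of a noisy magnitude spectrum and a mask."""
    return [m * x for m, x in zip(mask, noisy_mag)]


def sdr_db(reference, estimate, eps=1e-12):
    """Energy of the reference over energy of the error, in decibels."""
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal + eps) / (error + eps))


clean = [1.0, 0.0, 2.0, 0.0]   # toy clean spectrum
noisy = [1.2, 0.5, 2.1, 0.4]   # the same spectrum with noise added
mask = [0.9, 0.1, 1.0, 0.1]    # near 1 where speech dominates, near 0 elsewhere
enhanced = apply_mask(noisy, mask)
```

On this toy frame the masked spectrum lies closer to the clean one than the noisy input does, so `sdr_db(clean, enhanced)` exceeds `sdr_db(clean, noisy)`.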