• Title/Summary/Keyword: Audio Signal Processing

Development of Integrated Mixer Controller for Digital Public Address (디지털전관방송을 위한 통합믹서컨트롤러 개발)

  • Cho, Juphil; Kim, Kwan-Woong; Kim, Daeik
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.1 / pp.19-24 / 2017
  • Advances in IT have led to innovative products that combine IT with public address (PA) systems. In this paper, we present a hybrid mixer controller for digital PA systems. The integrated mixer controller combines the digital mixer of an existing digital PA system with the functions of a digital integrated controller. It provides a multichannel mixer with 16 audio input channels and 8 output channels, together with an equalizer, a matrix, and a limiter for processing digital audio signals. The controller also supports an internet connection for controlling the overall PA system and for remotely monitoring the state of the mixer.
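The matrix and limiter stages mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives only the channel counts (16 in, 8 out); the routing table, the unity gains, and the hard-clipping limiter below are assumptions made for illustration, not the controller's actual design.

```python
from typing import List

NUM_INPUTS, NUM_OUTPUTS = 16, 8  # channel counts taken from the abstract


def matrix_mix(inputs: List[float], gains: List[List[float]]) -> List[float]:
    """Mix one sample frame: outputs[o] = sum over i of gains[o][i] * inputs[i]."""
    return [sum(g * x for g, x in zip(row, inputs)) for row in gains]


def hard_limit(samples: List[float], ceiling: float = 1.0) -> List[float]:
    """Clamp each sample to [-ceiling, +ceiling] (the simplest possible limiter)."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


# Hypothetical routing: input i feeds output i % 8 at unity gain,
# so every output bus carries the sum of two inputs.
gains = [
    [1.0 if i % NUM_OUTPUTS == o else 0.0 for i in range(NUM_INPUTS)]
    for o in range(NUM_OUTPUTS)
]

frame = [0.75] * NUM_INPUTS          # one sample per input channel
mixed = matrix_mix(frame, gains)     # each output is 0.75 + 0.75
limited = hard_limit(mixed)          # clipped to the assumed ceiling of 1.0
```

A real limiter would normally apply smoothed gain reduction with attack and release times rather than clipping; the clamp is used here only to keep the sketch short.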

Prediction of Closed Quotient During Vocal Phonation using GRU-type Neural Network with Audio Signals

  • Hyeonbin Han; Keun Young Lee; Seong-Yoon Shin; Yoseup Kim; Gwanghyun Jo; Jihoon Park; Young-Min Kim
    • Journal of information and communication convergence engineering / v.22 no.2 / pp.145-152 / 2024
  • Closed quotient (CQ) represents the time ratio for which the vocal folds remain in contact during voice production. Because analyzing CQ values serves as an important reference point in vocal training for professional singers, these values have been measured mechanically or electrically by either inverse filtering of airflows captured by a circumferentially vented mask or post-processing of electroglottography waveforms. In this study, we introduced a novel algorithm to predict the CQ values only from audio signals. This has eliminated the need for mechanical or electrical measurement techniques. Our algorithm is based on a gated recurrent unit (GRU)-type neural network. To enhance the efficiency, we pre-processed an audio signal using the pitch feature extraction algorithm. Then, GRU-type neural networks were employed to extract the features. This was followed by a dense layer for the final prediction. The Results section reports the mean square error between the predicted and real CQ. It shows the capability of the proposed algorithm to predict CQ values.
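As a rough illustration of the quantity being predicted, the sketch below computes a closed quotient for a single glottal cycle from a contact signal, and the mean square error used to compare predictions with reference values. The threshold criterion (50% of peak-to-peak amplitude) and the synthetic cycle are assumptions; the paper's network predicts CQ from audio alone and is not reproduced here.

```python
from typing import Sequence


def closed_quotient(cycle: Sequence[float], criterion: float = 0.5) -> float:
    """Fraction of one glottal cycle in which the contact signal lies above
    a threshold placed at `criterion` of its peak-to-peak amplitude."""
    lo, hi = min(cycle), max(cycle)
    if hi == lo:
        raise ValueError("flat cycle: contact cannot be determined")
    threshold = lo + criterion * (hi - lo)
    closed_samples = sum(1 for v in cycle if v > threshold)
    return closed_samples / len(cycle)


def mean_square_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean of squared differences between predicted and reference CQ values."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Synthetic cycle of 10 samples: 4 samples in contact, 6 samples open.
cycle = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
cq = closed_quotient(cycle)
mse = mean_square_error([0.5, 0.5], [0.5, 0.25])
```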

A Study on Acoustic Sound Tracking System on 2-Dimensional Plain (2차원적 음원추적에 관한 연구)

  • 문성배; 전승환
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 1996.09a / pp.117-124 / 1996
  • When navigating in or near an area of restricted visibility, it is at times necessary to hear the whistle, bell, or siren of lighthouses or ships. Although brief information can be obtained about the properties of a sound and about the direction and range of its source, it is not easy to obtain information accurate enough for decision making. The audible frequency range is generally taken to be 16-20,000 Hz, but the audible distance shortens and sounds become harder to discriminate in the presence of noise. The sound pressure level of human speech at a distance of 1 meter is 60 dB. The noise level is usually 40 dB in a silent room and 60 dB on a quiet street. In this study, we suggest a basic algorithm for tracing the direction and range of a sound source using signals received by microphone sensors rather than by the human ear, followed by a series of signal processing steps.
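The abstract does not state how direction is derived from the microphone signals. One common approach, shown here purely as an assumed illustration, estimates the bearing of a distant source from the arrival-time difference between two microphones. The speed of sound, the microphone spacing, and the far-field assumption are all choices made for this sketch.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed)


def bearing_from_delay(delay_s: float, mic_spacing_m: float) -> float:
    """Far-field angle of arrival in degrees, measured from the broadside
    of a two-microphone pair, given the inter-microphone time delay."""
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against noisy delay estimates
    return math.degrees(math.asin(ratio))


# A source 30 degrees off broadside of a 0.5 m pair produces this delay.
spacing = 0.5
delay = spacing * math.sin(math.radians(30.0)) / SPEED_OF_SOUND
angle = bearing_from_delay(delay, spacing)
```

Range cannot be recovered from a single pair under the far-field assumption; a second pair at a known offset would be needed to triangulate.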

Area-wise relational knowledge distillation

  • Sungchul Cho; Sangje Park; Changwon Lim
    • Communications for Statistical Applications and Methods / v.30 no.5 / pp.501-516 / 2023
  • Knowledge distillation (KD) refers to extracting knowledge from a large and complex model (teacher) and transferring it to a relatively small model (student). This can be done by training the teacher model to obtain the activation function values of the hidden or output layers and then retraining the student model on the same training data together with the obtained values. Recently, relational KD (RKD) has been proposed to extract knowledge about relative differences in training data. This method improved the performance of the student model compared to conventional KD methods. In this paper, we propose a new loss function for RKD. The proposed loss function is defined using the area difference between the teacher model and the student model in a specific hidden layer, and we show that the model can be successfully compressed and that its generalization performance can be improved. In a model compression study on audio data, the proposed method achieves accuracy up to 1.8% higher than that of the existing method. In a model generalization study, introducing the RKD method to self-KD on image data improves accuracy by up to 0.5%.
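The idea of distilling "relative differences" can be sketched with a distance-wise relational loss: pairwise distances between embeddings are normalized by their mean and compared between teacher and student. This is a generic relational loss shown for orientation only; it is not the area-based loss proposed in the paper, whose definition the abstract does not give. The Huber penalty and the toy embeddings are assumptions.

```python
import math
from itertools import combinations
from typing import List, Sequence


def normalized_pairwise_distances(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Euclidean distance for every pair of samples, divided by the mean distance."""
    dists = [math.dist(a, b) for a, b in combinations(embeddings, 2)]
    mean = sum(dists) / len(dists)
    if mean == 0.0:
        raise ValueError("all embeddings are identical")
    return [d / mean for d in dists]


def huber(x: float, y: float, delta: float = 1.0) -> float:
    """Huber penalty: quadratic for small differences, linear for large ones."""
    diff = abs(x - y)
    return 0.5 * diff * diff if diff <= delta else delta * (diff - 0.5 * delta)


def relational_distance_loss(teacher: Sequence[Sequence[float]],
                             student: Sequence[Sequence[float]]) -> float:
    """Average Huber penalty between teacher and student relational structure."""
    t = normalized_pairwise_distances(teacher)
    s = normalized_pairwise_distances(student)
    return sum(huber(a, b) for a, b in zip(t, s)) / len(t)


teacher_emb = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
scaled_student = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]    # same shape, larger scale
distorted_student = [(0.0, 0.0), (3.0, 0.0), (0.0, 1.0)]  # different shape

loss_scaled = relational_distance_loss(teacher_emb, scaled_student)
loss_distorted = relational_distance_loss(teacher_emb, distorted_student)
```

Because the distances are normalized, the student is not penalized for using a different embedding scale, only for changing how samples relate to one another.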

Development of Processing System for Audio-vision System Based on Auditory Input (청각을 이용한 시각 재현 시스템의 개발)

  • Kim, Jung-Hun; Kim, Deok-Kyu; Won, Chul-Ho; Lee, Jong-Min; Lee, Hee-Jung; Lee, Na-Hee; Yoon, Su-Young
    • Journal of Biomedical Engineering Research / v.33 no.1 / pp.25-31 / 2012
  • An audio-vision system was developed for visually impaired people, and its usability was verified. Ten normal volunteers were included in the subject group; their mean age was 28.8 years, and the male-to-female ratio was 7:3. Usability was verified as follows. First, the volunteers learned to judge the distance of obstacles and to discriminate up from down. After this training, indoor and outdoor walking tests were performed. The tests were scored on up-down and lateral discrimination, distance recognition, and walking without collision, with each parameter scored from 1 to 5. The result was 93.5 ± SD (range, 86 to 100) out of 100. In this study, we converted visual information to auditory information with the audio-vision system and verified the possibility of applying it to the daily life of visually impaired people.

Similar Movie Contents Retrieval Using Peak Features from Audio (오디오의 Peak 특징을 이용한 동일 영화 콘텐츠 검색)

  • Chung, Myoung-Bum; Sung, Bo-Kyung; Ko, Il-Ju
    • Journal of Korea Multimedia Society / v.12 no.11 / pp.1572-1580 / 2009
  • Combing through entire video files to recognize and retrieve matching movies requires much time and memory. Instead, most current similar-movie matching methods analyze only a part of each movie's video-image information. Yet these methods still share a critical problem: they erroneously treat matching videos as different when the videos have been altered only in resolution or merely converted with a different codec. This paper proposes an audio-information-based search algorithm by which similar movies can be identified. The proposed method builds and searches a database of movies' spectral peak information, which remains relatively steady even with changes in bit rate, codec, or sample rate. The method showed a 92.1% search success rate on a set of 1,000 video files whose audio bit rate had been altered or which had been purposely encoded with a different codec.
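A minimal sketch of spectral peak extraction follows, to show why such features can survive re-encoding: the peak positions depend on which frequencies dominate, not on overall level. The naive DFT, the frame length, and the top-k selection rule are assumptions for illustration; the paper's actual feature definition and database search are not described in the abstract.

```python
import cmath
import math
from typing import List, Sequence


def dft_magnitudes(frame: Sequence[float]) -> List[float]:
    """Magnitude spectrum (bins 0..N/2) via a naive O(N^2) DFT."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectral_peaks(frame: Sequence[float], top_k: int = 2) -> List[int]:
    """Bin indices of the `top_k` strongest local maxima, in ascending order."""
    mags = dft_magnitudes(frame)
    local_maxima = [
        k for k in range(1, len(mags) - 1)
        if mags[k] > mags[k - 1] and mags[k] > mags[k + 1]
    ]
    strongest = sorted(local_maxima, key=lambda k: mags[k], reverse=True)[:top_k]
    return sorted(strongest)


# Synthetic frame with two tones centred on bins 5 and 12.
N = 64
frame = [
    math.sin(2 * math.pi * 5 * t / N) + 0.5 * math.sin(2 * math.pi * 12 * t / N)
    for t in range(N)
]
peaks = spectral_peaks(frame)
peaks_after_gain_change = spectral_peaks([0.3 * v for v in frame])
```

A practical system would use an FFT and short overlapping frames; the naive DFT is kept here so the example has no dependencies.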

Implementation of Public Address System Using Anchor Technology

  • Seungwon Lee; Soonchul Kwon; Seunghyun Lee
    • International journal of advanced smart convergence / v.12 no.3 / pp.1-12 / 2023
  • A public address (PA) system installed in a building delivers alerts, announcements, and instructions in an emergency or disaster. With the development of information and communication technology, PA products with various functions have been introduced to the market. Recently launched PA systems may be connected through a single network to enable efficient management and operation, or may use voice recognition technology to deliver information quickly in an emergency. In addition, systems are being launched that locate a user inside a building using a location-based service and guide the user to a safe area in an emergency. However, these new PA systems only add functions to the existing PA system configuration to make operation more convenient; they do not change the complex configuration itself to reduce facility, maintenance, and management costs. In this paper, we propose a novel PA system configuration for buildings using audio networks and control hierarchy over peer-to-peer (Anchor) technology based on audio over IP (AoIP), which simplifies the complex PA system configuration and enables convenient operation and management. In the study, an emergency signal processing algorithm enabled fire broadcasting upon detection of a fire signal in the Anchor system. In addition, the control device of the PA system was replaced with software to reduce the equipment installation cost, and the PA system configuration was simplified. PA systems using Anchor technology are expected to become the standard for PA facilities in the future.

Modeling of Acoustic Echo Canceller Using Subband Adaptive Signal Processing (서브밴드 적응신호처리를 이용한 음향 에코제거기의 모델링)

  • Kim, Chun-Duck; Sim, Dong-Youn; Chung, Ho-Moon; Lee, Jun-Ku; Cha, Kyung-Hwan
    • The Journal of the Acoustical Society of Korea / v.16 no.5 / pp.43-49 / 1997
  • Generally, the echo canceller of a TV conference or audio conference system has difficulty operating in real time in a closed room with a long reverberation time, because the system requires much time to adapt its filter coefficients to environmental changes. To solve this problem, this paper proposes a new subband adaptive filtering method using the polyphase filter banks of the MPEG (Moving Picture Experts Group) audio system. The method divides the input and output signal spectra into several frequency bands, and each band is adaptively filtered using the ES-NLMS (Exponential Step-Normalized Least Mean Square) algorithm. The optimal number of subbands is determined by computer simulation. According to the simulation results, the ERLE of the subband model is 2 dB lower than that of the conventional full-band model, while its computational load is reduced by about 88%.
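For orientation, the sketch below shows a plain full-band NLMS echo canceller and an ERLE measurement on synthetic data. It is deliberately simpler than the method in the abstract: there is no polyphase filter bank, and a single step size is used instead of the exponentially weighted per-tap steps of ES-NLMS. The echo path, tap count, and step size are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def nlms_echo_canceller(far_end: Sequence[float], mic: Sequence[float],
                        num_taps: int = 8, step: float = 0.5,
                        eps: float = 1e-8) -> Tuple[List[float], List[float]]:
    """Adapt an FIR estimate of the echo path; return (weights, residual echo)."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        energy = sum(h * h for h in history) + eps  # normalization term
        weights = [w + step * error * h / energy for w, h in zip(weights, history)]
        residual.append(error)
    return weights, residual


def erle_db(mic: Sequence[float], residual: Sequence[float]) -> float:
    """Echo return loss enhancement: microphone power over residual power, in dB."""
    p_mic = sum(v * v for v in mic)
    p_res = sum(v * v for v in residual)
    return 10.0 * math.log10((p_mic + 1e-30) / (p_res + 1e-30))


# Synthetic test: white far-end signal through a short, hypothetical echo path.
rng = random.Random(0)
echo_path = [0.5, -0.3, 0.1]
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]

weights, residual = nlms_echo_canceller(far_end, mic)
tail_erle = erle_db(mic[-500:], residual[-500:])
```

In a subband design, one such adaptive filter would run per band at a reduced sample rate, which is where the reduction in computation reported in the abstract would come from.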

An Implementation of Highly Integrated Signal Processing IC for HDTV

  • Hahm Cheul-Hee; Park Kon-Kyu; Kim Hyoung-Gil; Jung Choon-Sik; Lee Sang-keun; Jang Jae-Young; Park Sung-Uk; Chon Byung-Hoan; Chun Kang-Wook; Jo Jae-Moon; Song Dong-il
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2003.11a / pp.69-72 / 2003
  • This paper presents a signal processing IC for digital HDTV, which is designed to operate in a built-in HDTV or in an HD set-top box. The chip supports de-multiplexing of an ISO/IEC 13818-1 MPEG-2 transport stream (TS). It decodes an MPEG-2 MP@HL video bitstream and provides high-quality scaled video for display on an HDTV monitor. The chip consists of an ARM7TDMI for TS demultiplexing, a PCI interface, an audio interface, an MPEG-2 MP@HL video decoder, a display processor, a graphic processor, a memory controller, a smart card interface, and a UART. It is fabricated using Samsung's 0.18-µm process and is packaged in a 492-pin BGA.
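The first step of transport stream demultiplexing is reading the 4-byte header of each 188-byte packet and routing the packet by its PID. The sketch below shows that step in software, following the header layout of ISO/IEC 13818-1. It says nothing about how the chip implements demultiplexing; the function name and the example packet are made up for illustration.

```python
from typing import Dict

TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def parse_ts_header(packet: bytes) -> Dict[str, int]:
    """Decode the fixed 4-byte header of one MPEG-2 transport stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport stream packet must be 188 bytes")
    if packet[0] != TS_SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    return {
        "transport_error": (packet[1] >> 7) & 0x01,
        "payload_unit_start": (packet[1] >> 6) & 0x01,
        "transport_priority": (packet[1] >> 5) & 0x01,
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],   # 13-bit packet identifier
        "scrambling_control": (packet[3] >> 6) & 0x03,
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,
    }


# Example packet: PID 0x100, start of a payload unit, payload only, counter 7.
example_packet = bytes([0x47, 0x41, 0x00, 0x17]) + bytes(184)
header = parse_ts_header(example_packet)
```

A demultiplexer would use the `pid` field to forward each packet to the video decoder, the audio interface, or the section filters.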

Audio Contents Adaptation Technology According to User's Preference on Sound Fields (사용자의 음장선호도에 따른 오디오 콘텐츠 적응 기술)

  • 강경옥; 홍재근; 서정일
    • The Journal of the Acoustical Society of Korea / v.23 no.6 / pp.437-445 / 2004
  • In this paper, we describe a novel method for transforming audio content according to the user's sound field preference. Sound field effect technologies, which transform or simulate acoustic environments according to the user's preference, are very important for enhancing the realism of an acoustic scene. However, a huge amount of computational power is required to process sound field effects in real time, so it is hard to implement this functionality on portable audio devices such as MP3 players. We propose an efficient method for providing sound field effects to audio content independent of the terminal's computational power, by processing this functionality at the server using the user's sound field preference, which is transferred from the terminal side. To describe a sound field preference, the user can use perceptual acoustic parameters as well as the URI address of a room impulse response signal. In addition, a novel fast convolution method is presented to implement a sound field effect engine that convolves the audio with a room impulse response signal, and it is verified through experiments to be applicable to real-time applications. To verify the benefit of the proposed method, we performed two subjective listening tests, on the ability to discriminate sound fields and on preference for sound-field-processed sounds. The results showed that the proposed sound field preference is applicable to the general public.
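Applying a sound field effect amounts to convolving the audio with a room impulse response. The sketch below shows the textbook FFT-based version of that operation, which is the usual starting point for fast convolution. It is not the paper's novel method, which the abstract does not describe; the recursive radix-2 FFT and the toy signals are assumptions for illustration.

```python
import cmath
from typing import List, Sequence


def fft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Recursive radix-2 FFT (length must be a power of two, unnormalized)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Linear convolution computed by multiplying zero-padded spectra."""
    out_len = len(signal) + len(impulse_response) - 1
    size = 1
    while size < out_len:
        size *= 2
    a = fft([complex(v) for v in signal] + [0j] * (size - len(signal)))
    b = fft([complex(v) for v in impulse_response] + [0j] * (size - len(impulse_response)))
    product = [p * q for p, q in zip(a, b)]
    y = fft(product, inverse=True)
    return [v.real / size for v in y[:out_len]]


dry = [1.0, 2.0, 3.0]            # toy audio samples
room_ir = [1.0, 0.5]             # toy impulse response: direct sound plus one echo
wet = fft_convolve(dry, room_ir)
```

For long impulse responses in real time, the same multiplication is usually applied block by block (overlap-add or overlap-save) so that latency stays bounded.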