Integrated Search | Korea Science

Retrieval of Broadcast News Using Audio Content Analysis

Kim, Hyoung-Gook
- The Journal of the Acoustical Society of Korea / Vol. 26, No. 3E / pp. 74-79 / 2007
In this paper, we report our recent work on an indexing and retrieval system for broadcast news using audio content analysis. The work addresses two major parts of the audio indexing system: anchorperson detection based on audio segmentation, and phone-based spoken document retrieval, both developed in the framework of the emerging MPEG-7 standard. Experiments are conducted on a database of British broadcast news videos. We discuss the development of the retrieval system and the evaluation of each part and of the system as a whole.

A Study on Setting the Minimum and Maximum Distances for Distance Attenuation in MPEG-I Immersive Audio

Lee, Yong Ju;Yoo Jae-hyoun;Jang, Daeyoung;Kang, Kyeongok;Lee, Taejin
- Journal of Broadcast Engineering / Vol. 27, No. 7 / pp. 974-984 / 2022
In this paper, we introduce methods for setting the minimum and maximum distances used in geometric distance attenuation processing, one of the spatial sound reproduction methods. In general, sound attenuation by distance follows the 1/r law, that is, the level is inversely proportional to distance, but when the relative distance between the user and the audio object is very short or very long, exceptional processing may be applied by setting a minimum or maximum distance. While MPEG-I Immersive Audio's RM0 uses fixed values for the minimum and maximum distances, this study proposes effective methods for setting the distances that take the signal gain of an audio object into account. The proposed methods were verified through simulation and through experiments using the RM0 renderer.
https://doi.org/10.5909/JBE.2022.27.7.974
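
The 1/r law with a minimum and maximum distance, as described in the abstract above, can be illustrated with a minimal sketch. It assumes the exceptional processing is a simple clamp of the distance to a fixed range. The function name `distance_gain` and the values of `r_min`, `r_max`, and `r_ref` are hypothetical; they are neither the fixed values used by RM0 nor the gain-dependent settings the paper proposes.

```python
def distance_gain(r, r_min=0.1, r_max=100.0, r_ref=1.0):
    """Linear gain under the 1/r law, with r clamped to [r_min, r_max].

    r_ref is the distance at which the gain equals 1.0.
    All default values are illustrative assumptions, not taken from RM0.
    """
    if r_min <= 0 or r_max < r_min:
        raise ValueError("require 0 < r_min <= r_max")
    # Clamp so the gain neither diverges near the source
    # nor keeps decaying beyond the maximum distance.
    r_clamped = min(max(r, r_min), r_max)
    return r_ref / r_clamped


if __name__ == "__main__":
    for r in (0.0, 0.5, 2.0, 1000.0):
        print(r, distance_gain(r))
```

With fixed limits like these, every object closer than `r_min` gets the same gain; the abstract's point is that choosing the limits per object, based on its signal gain, can be more effective.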

A Novel Audio Watermarking Algorithm for Copyright Protection of Digital Audio

Seok, Jong-Won;Hong, Jin-Woo;Kim, Jin-Woong
- ETRI Journal / Vol. 24, No. 3 / pp. 181-189 / 2002
Digital watermark technology is now drawing attention as a new method of protecting digital content from unauthorized copying. This paper presents a novel audio watermarking algorithm to protect against unauthorized copying of digital audio. The proposed watermarking scheme includes a psychoacoustic model of MPEG audio coding to ensure that the watermarking does not affect the quality of the original sound. After embedding the watermark, our scheme extracts copyright information without access to the original signal by using a whitening procedure for linear prediction filtering before correlation. Experimental results show that our watermarking scheme is robust against common signal processing attacks and it introduces no audible distortion after watermark insertion.

Constructing a Noise-Robust Speech Recognition System using Acoustic and Visual Information

이종석;박철훈
- Journal of Institute of Control, Robotics and Systems / Vol. 13, No. 8 / pp. 719-725 / 2007
In this paper, we present an audio-visual speech recognition system for noise-robust human-computer interaction. Unlike usual speech recognition systems, our system utilizes the visual signal containing speakers' lip movements along with the acoustic signal to obtain robust speech recognition performance against environmental noise. The procedures of acoustic speech processing, visual speech processing, and audio-visual integration are described in detail. Experimental results demonstrate the constructed system significantly enhances the recognition performance in noisy circumstances compared to acoustic-only recognition by using the complementary nature of the two signals.
https://doi.org/10.5302/J.ICROS.2007.13.8.719

Implementation of the Audio CODEC for Digital Audio Broadcasting Service

장대영;홍진우
- Journal of Broadcast Engineering / Vol. 6, No. 1 / pp. 66-71 / 2001
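This paper describes the development of an AAC (MPEG-2 Advanced Audio Coding) codec system for use as the source coder of a digital audio broadcasting system. To interface with the digital audio broadcasting system proposed by ETRI, the encoder and decoder take input and produce output in the TS (Transport Stream) format of the MPEG-2 (Moving Picture Experts Group Phase 2) system. The DSP (Digital Signal Processor) used for internal audio signal processing is the TMS320C6701 (floating point, 166 MHz) from TI (Texas Instruments); the encoder is designed to be configurable with up to four DSPs and the decoder with up to three. The DSPs perform system control, audio signal input, audio signal processing, TS signal generation, and bitstream output, and they exchange data over serial and parallel connections. A two-channel AAC codec has been implemented on this system so far. A multichannel AAC codec and an MPEG-4 audio codec are planned for implementation on the same system, which is expected to be used in DAB and other digital broadcasting applications.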

Improving Low Frequency Signal Reproduction in TV Audio

마나쉬 아로라;오윤학;김승훈;이혁재;장성철
- Proceedings of the Acoustical Society of Korea Conference / Proceedings of the Acoustical Society of Korea 2004 Spring Conference, Vol. 23, No. 1 / pp. 275-278 / 2004
In TV sound systems, loudspeakers are subject to severe size constraints. The small size of the transducer degrades the low frequency signal performance of the system. Bass signal performance contributes significantly to the user-perceived sound quality, so good bass signal reproduction is essential. Increasing the sound energy in the bass signal range is not a viable solution, since the gains required are exceedingly high and signal distortion occurs because of speaker overload. Recently, methods have been proposed to create a low frequency illusion using the psychoacoustic phenomenon of the missing fundamental. This paper proposes a simple and effective signal processing method to create a bass signal illusion in TV speakers using the missing fundamental effect, at a complexity of 12 MIPS on a Motorola 56371 audio DSP.

A Blind Audio Watermarking using the Tonal Characteristic

이희숙;이우선
- Journal of Korea Multimedia Society / Vol. 6, No. 5 / pp. 816-823 / 2003
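This paper proposes a blind audio watermarking method that uses the tonal characteristic. We first review the perceptual influence of tonal components through existing psychoacoustic research and show, by comparison with other features used for watermarking, that tonal components remain very stable after various signal processing operations. On this basis, we propose a blind audio watermarking technique that uses the relationship among the frequency signals that make up a tonal masker. In an SDG (Subjective Diff-Grades) sound quality evaluation of audio to which the technique was applied, an average SDG of 0.27 was obtained, which suggests that watermarking based on the perceptual influence of tonal components is useful in terms of imperceptibility. In addition, the watermark extraction rate after various signal processing operations other than time shift was 98% or higher, demonstrating the robustness of the proposed watermarking. For time-shift processing, a new method that finds the optimal position on the time axis before extraction achieved an extraction rate of 90%.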

The Digital Redundancy Design for Back-up Mode Operation of Aviation Intercom

정성재;조경학;김동혁;이성우
- Journal of Advanced Navigation Technology / Vol. 26, No. 5 / pp. 358-364 / 2022
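An aviation intercom system is the equipment responsible for processing all voice signals in an aircraft: internal calls between the pilot and copilot and between pilots and crew, external calls through communication equipment such as U/VHF radios, monitoring of audio signals from navigation and mission equipment such as the VHF omnidirectional range/instrument landing system (VOR/ILS) and tactical air navigation (TACAN), output of audio signals for voice recording to the flight data recorder (FDR) and data transfer system (DTS), and generation of audio warning tones and voice warnings about aircraft status and threats. Because analog audio signals are sensitive to noise, such an intercom needs a redundant design that can protect audio signals from electromagnetic noise inside and outside the aircraft so that pilots and crew can carry out their missions. This paper describes the normal and backup operating modes for redundancy in a digital aviation intercom, a digital redundancy design, and the results of its fabrication and verification.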
https://doi.org/10.12673/jant.2022.26.5.358

Finite Alphabet Control and Estimation

Goodwin, Graham C.;Quevedo, Daniel E.
- International Journal of Control, Automation, and Systems / Vol. 1, No. 4 / pp. 412-430 / 2003
In many practical problems in signal processing and control, the signal values are often restricted to belong to a finite number of levels. These questions are generally referred to as "finite alphabet" problems. There are many applications of this class of problems including: on-off control, optimal audio quantization, design of finite impulse response filters having quantized coefficients, equalization of digital communication channels subject to intersymbol interference, and control over networked communication channels. This paper will explain how this diverse class of problems can be formulated as optimization problems having finite alphabet constraints. Methods for solving these problems will be described and it will be shown that a semi-closed form solution exists. Special cases of the result include well known practical algorithms such as optimal noise shaping quantizers in audio signal processing and decision feedback equalizers in digital communication. Associated stability questions will also be addressed and several real world applications will be presented.

Audio and Video Bimodal Emotion Recognition in Social Networks Based on Improved AlexNet Network and Attention Mechanism

Liu, Min;Tang, Jun
- Journal of Information Processing Systems / Vol. 17, No. 4 / pp. 754-771 / 2021
In the task of continuous dimensional emotion recognition, the parts that highlight emotional expression are not the same in each mode, and the influences of different modes on the emotional state also differ. Therefore, this paper studies the fusion of the two most important modes in emotion recognition (voice and visual expression), and proposes a dual-modal emotion recognition method that combines an attention mechanism with an improved AlexNet network. After simple preprocessing of the audio signal and the video signal, the first step is to use prior knowledge to extract audio features. Then, facial expression features are extracted by the improved AlexNet network. Finally, a multimodal attention mechanism is used to fuse the facial expression features and audio features, and an improved loss function is used to address the missing-modality problem, so as to improve the robustness of the model and the performance of emotion recognition. The experimental results show that the concordance correlation coefficients of the proposed model in the two dimensions of arousal and valence were 0.729 and 0.718, respectively, which are superior to those of several comparative algorithms.
https://doi.org/10.3745/JIPS.02.0161
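
The concordance correlation coefficient reported in the abstract above is commonly computed from the means, variances, and covariance of the predicted and reference values. The following is a minimal sketch of that standard definition, using population (divide-by-n) statistics. The function name `concordance_cc` is hypothetical, and the paper's exact evaluation procedure may differ.

```python
def concordance_cc(pred, gold):
    """Concordance correlation coefficient between two sequences.

    CCC = 2 * cov / (var_pred + var_gold + (mean_pred - mean_gold) ** 2),
    computed here with population (divide-by-n) statistics.
    """
    n = len(pred)
    if n == 0 or n != len(gold):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant inputs")
    return 2.0 * cov / denom


if __name__ == "__main__":
    # A constant offset lowers CCC even though the correlation is perfect.
    print(concordance_cc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

Unlike the Pearson correlation, this measure penalizes differences in mean and scale between predictions and references, which is why it is often used for continuous arousal and valence labels.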
