• Title/Summary/Keyword: audio engineering

818 search results

Robust Person Identification Using Optimal Reliability in Audio-Visual Information Fusion

  • Tariquzzaman, Md.; Kim, Jin-Young; Na, Seung-You; Choi, Seung-Ho
    • The Journal of the Acoustical Society of Korea / v.28 no.3E / pp.109-117 / 2009
  • Identity recognition in real environments with a reliable modality is a key issue in human-computer interaction (HCI). In this paper, we present a robust person identification system based on a score-based optimal reliability measure of audio-visual modalities. We propose an extension of the modified reliability function by introducing optimizing parameters for both the audio and visual modalities. To degrade the visual signals, we applied JPEG compression to the test images. In addition, to create a mismatch between the enrollment and test sessions, acoustic Babble noise and artificial illumination were added to the test audio and visual signals, respectively. Local PCA was used on both modalities to reduce the dimension of the feature vectors. We applied a swarm intelligence algorithm, particle swarm optimization, to optimize the modified reliability function's parameters. The person identification experiments were performed on the VidTimit DB. Experimental results show that the proposed optimal reliability measures improved identification accuracy by 7.73% and 8.18% under varying illumination directions on the visual signal and corresponding Babble noise on the audio signal, respectively, compared with the best single classifier in the fusion system, while maintaining the modality reliability statistics in terms of performance, thus verifying the consistency of the proposed extension.
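The core idea of the abstract above, score-level fusion of audio and visual classifier scores with reliability weights, can be sketched as follows. This is a minimal illustration only: the fixed weights `alpha_a` and `alpha_v` are hypothetical placeholders for the paper's PSO-optimized reliability function, and the min-max normalization is one common convention, not necessarily the one used in the paper.

```python
# Minimal sketch of reliability-weighted score-level fusion for
# audio-visual person identification. The reliability measure and its
# PSO-optimized parameters in the paper are more elaborate; here the
# weights alpha_a / alpha_v are fixed illustrative values.

def normalize(scores):
    """Min-max normalize a list of classifier scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse(audio_scores, visual_scores, alpha_a=0.6, alpha_v=0.4):
    """Weighted-sum fusion of per-identity scores from two modalities."""
    a = normalize(audio_scores)
    v = normalize(visual_scores)
    return [alpha_a * ai + alpha_v * vi for ai, vi in zip(a, v)]

def identify(audio_scores, visual_scores, **kw):
    """Return the index of the identity with the highest fused score."""
    fused = fuse(audio_scores, visual_scores, **kw)
    return max(range(len(fused)), key=fused.__getitem__)
```

In the paper the weights would track the estimated reliability of each modality, so that, for example, a heavily illuminated face image lowers the visual weight rather than corrupting the decision.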

Curriculum Development of Acoustics and Audio Engineering on Digital Convergence Environment (디지털 융복합 환경을 고려한 음향 및 오디오 기술 교육과정 개발)

  • Oh, Wongeun; Rhee, Esther
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.2 / pp.191-197 / 2013
  • In this paper, we present a college curriculum for acoustics and audio engineering in a digital convergence environment. For this purpose, we categorized the colleges linked on the ASA and AES websites into three types according to their characteristics: acoustics-oriented, applied-acoustics, and convergence-oriented. A basic step-by-step curriculum model for audio acoustics is then suggested based on this category analysis. The proposed model can be used effectively to build acoustics and audio technology courses at domestic colleges.

Audio Data Transmission Based on The Wavelet Transform for ZigBee Applications (ZigBee 응용을 위한 웨이블릿변환 기반 오디오 데이터 전송)

  • Chen, Zhenxing; Choi, Eun Chang; Huh, Jae Doo; Kang, Seog Geun
    • IEMEK Journal of Embedded Systems and Applications / v.2 no.1 / pp.31-42 / 2007
  • A transform coding scheme for the transmission of audio data in ZigBee-based wireless personal area networks (WPANs) is presented in this paper. Here, the wavelet transform is exploited to encode the features of audio data, which lie mainly in the low-frequency region. It is confirmed that the presented scheme recovers the original audio signals accurately while transmitting binary data compressed to 37.5% of the data generated without a coding scheme. In particular, the mean-squared error between the recovered and original audio data approaches $10^{-4}$ when the signal-to-noise power ratio is sufficiently high. Hence, the presented wavelet-based coding scheme can be applied to high-quality audio transmission services in small-scale ZigBee sensor networks. The result can also serve as basic material for updating the technical specifications and developing ZigBee applications in WPANs.
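The principle behind wavelet-based audio coding of the kind described above can be sketched with a one-level Haar transform: the signal is split into low- and high-frequency halves, only the low-frequency (approximation) coefficients are transmitted, and the inverse transform reconstructs an approximation. This is an assumption-laden illustration; the paper's codec, and its 37.5% rate, uses a more refined decomposition than the simple "drop the detail half" shown here.

```python
import math

# One-level Haar DWT sketch: keep only the low-frequency half of the
# coefficients and reconstruct with the details assumed zero. Input
# length is assumed even for simplicity.

def haar_forward(x):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    approx = [(x[2*i] + x[2*i+1]) / math.sqrt(2) for i in range(len(x)//2)]
    detail = [(x[2*i] - x[2*i+1]) / math.sqrt(2) for i in range(len(x)//2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse one-level Haar DWT."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / math.sqrt(2))
        x.append((a - d) / math.sqrt(2))
    return x

def compress(x):
    """Transmit only the approximation half (50% of the samples here)."""
    approx, _ = haar_forward(x)
    return approx

def decompress(approx):
    """Reconstruct with the untransmitted detail coefficients set to zero."""
    return haar_inverse(approx, [0.0] * len(approx))
```

For a slowly varying signal the detail coefficients are near zero, so the reconstruction error stays small even though half the coefficients are discarded.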

An Efficient Audio Watermark Extraction in Time Domain

  • Kang, Hae-Won; Jung, Sung-Hwan
    • Journal of Information Processing Systems / v.2 no.1 / pp.13-17 / 2006
  • In this paper, we propose an audio watermark extraction method that decreases the influence of the original signal by modifying the watermark detection system proposed by P. Bassia et al. In extracting the watermark, we employ a simple mean filter to remove the influence of the original signal as a preprocessing step, together with repetitive insertion of the watermark. In experiments using about 20 kinds of actual audio data, we obtained a watermark detection rate of about 95% and good performance even after various signal-processing attacks.
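The mean-filter preprocessing idea in the abstract above can be sketched as follows: a moving average estimates the host signal, and correlating the residual with the known watermark pattern recovers the embedded bit. The filter length, watermark amplitude, and ±1 pattern here are illustrative assumptions, not the parameters of the paper.

```python
# Illustrative time-domain watermark extraction in the spirit of the
# scheme above: subtract a mean-filter estimate of the host signal,
# then correlate the residual with the known +/-1 watermark pattern.

def mean_filter(x, k=3):
    """Simple moving-average estimate of the host signal."""
    half = k // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def extract_bit(watermarked, wm_pattern):
    """Correlate the residual (signal minus mean-filter estimate) with
    the known watermark pattern; the sign of the correlation gives the bit."""
    residual = [s - m for s, m in zip(watermarked, mean_filter(watermarked))]
    corr = sum(r * w for r, w in zip(residual, wm_pattern))
    return 1 if corr >= 0 else 0
```

Because the host audio is locally smooth while the watermark alternates rapidly, the mean filter passes the host and rejects the watermark, so subtracting it isolates the embedded pattern.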

Audio Watermarking through Modification of Tonal Maskers

  • Lee, Hee-Suk; Lee, Woo-Sun
    • ETRI Journal / v.27 no.5 / pp.608-616 / 2005
  • Watermarking has become a technology of choice for a broad range of multimedia copyright protection applications. This paper proposes an audio watermarking scheme that uses a modified tonal masker as an embedding carrier for imperceptible and robust audio watermarking. The embedding method is to select one of the tonal maskers using a secret key, and then to modify the frequency components that constitute the tonal masker without changing the sound pressure level. The modified tonal masker can be found using the same secret key without the original sound, and the embedded information can then be extracted. The results show that the modified frequency components are stable enough to retain embedded watermarks through various common types of signal processing, and that the proposed scheme performs robustly.

Collocated Wearable Interaction for Audio Book Application on Smartwatch and Hearables

  • Yoon, Hyoseok; Son, Jangmi
    • Journal of Multimedia Information System / v.7 no.2 / pp.107-114 / 2020
  • This paper proposes a wearable audio book application using two wearable devices: a smartwatch and hearables. We review the requirements of what could be a killer wearable application and design our application based on these elicited requirements. To distinguish our application, we present seven scenarios and introduce several wearable interaction modalities. To show the feasibility of our approach, we design and implement a proof-of-concept prototype on an Android emulator as well as on a commercial smartwatch. We thoroughly address how the different interaction modalities are designed and implemented on the Android platform. Lastly, we show the latency of the multi-modal and alternative interaction modalities that can be gracefully handled in wearable audio application use cases.

A Study of the Development System of 3D Sound Contents (3D사운드 컨텐츠 개발 시스템의 고찰)

  • Yi, Woo-Seock; Kim, Kyung-Sik
    • Journal of Korea Game Society / v.2 no.2 / pp.72-77 / 2002
  • Sound content today is trending toward 3D surround audio, which, with the aid of DVD, offers much higher quality than CD audio. Development environments for sound content should advance accordingly, and efficient application of such systems is quite necessary. In this paper, we propose an improved method for utilizing sound/audio card systems, which are necessary for developing game and multimedia sound content, together with related know-how on producing 3D sound with sound production tools for their efficient usage and application.

Acoustic Event Detection in Multichannel Audio Using Gated Recurrent Neural Networks with High-Resolution Spectral Features

  • Kim, Hyoung-Gook; Kim, Jin Young
    • ETRI Journal / v.39 no.6 / pp.832-840 / 2017
  • Recently, deep recurrent neural networks have achieved great success in various machine learning tasks, and have also been applied for sound event detection. The detection of temporally overlapping sound events in realistic environments is much more challenging than in monophonic detection problems. In this paper, we present an approach to improve the accuracy of polyphonic sound event detection in multichannel audio based on gated recurrent neural networks in combination with auditory spectral features. In the proposed method, human hearing perception-based spatial and spectral-domain noise-reduced harmonic features are extracted from multichannel audio and used as high-resolution spectral inputs to train gated recurrent neural networks. This provides a fast and stable convergence rate compared to long short-term memory recurrent neural networks. Our evaluation reveals that the proposed method outperforms the conventional approaches.
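The gated recurrence at the heart of the network described above can be illustrated with a single GRU cell. This sketch uses scalar states and hand-picked weights purely for illustration; the paper's model is a full multi-layer network trained on multichannel, noise-reduced spectral features, and the weight names (`wz`, `uz`, etc.) are labels chosen here, not the paper's notation.

```python
import math

# Minimal single GRU cell for a scalar input and scalar hidden state,
# illustrating the update/reset gating that makes GRU training converge
# faster and more stably than comparable LSTM variants.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(h, x, w):
    """One GRU update. w holds scalar weights for the update gate (z),
    reset gate (r), and candidate state (n)."""
    z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])          # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])          # reset gate
    n = math.tanh(w["wn"] * x + w["un"] * (r * h) + w["bn"])  # candidate
    return (1 - z) * h + z * n                                # blend old/new

def run_sequence(xs, w, h0=0.0):
    """Run the cell over a sequence of inputs and return the final state."""
    h = h0
    for x in xs:
        h = gru_step(h, x, w)
    return h
```

Because the new state is a convex combination of the old state and a tanh-bounded candidate, the hidden state stays in (-1, 1), which is one reason gated recurrences avoid the exploding activations of plain RNNs.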

Car-audio Position Evaluation Using 3-Dimensional Motion Analysis (동작분석을 이용한 카 오디오 위치 평가)

  • 임창주; 임치환
    • Journal of Korean Society of Industrial and Systems Engineering / v.24 no.62 / pp.79-87 / 2001
  • Usability has become a primary factor in determining the acceptability and consequent success of consumer products. It is common for product usability to be evaluated by objective performance measures and/or subjective user-preference measures. This study is concerned with the objective evaluation of product usability using 3-dimensional motion analysis, which we applied to evaluate car-audio position. The parameters investigated in this experiment were the height of the car audio, the left-right angle, and the front-end angle. The experimental results showed that the usability evaluation method using motion analysis was consistent with users' subjective assessments. This objective method can be applied not only to car-audio position evaluation but also to the usability evaluation of various consumer products.

Collision Hazards Detection for Construction Workers Safety Using Equipment Sound Data

  • Elelu, Kehinde; Le, Tuyen; Le, Chau
    • International conference on construction engineering and project management / 2022.06a / pp.736-743 / 2022
  • Construction workers experience a high rate of fatal incidents involving mobile equipment. One of the major causes is workers' declining acoustic awareness due to constant exposure to construction noise. Previous studies have proposed various ways in which audio sensing and machine learning techniques can be used to track equipment movement on construction sites, but not to address the audibility of safety signals. This study develops a novel framework to help automate safety surveillance on construction sites by detecting equipment sounds at signal-to-noise ratios of -10 dB, -5 dB, 0 dB, 5 dB, and 10 dB to notify workers of the imminent dangers of mobile equipment. The scope of this study is focused on developing a signal processing model to improve the audibility of mobile equipment for workers. The study includes three phases: (a) collecting audio data of construction equipment, (b) developing a novel audio-based machine learning model for automated detection of collision hazards, to be integrated into intelligent hearing protection devices, and (c) conducting field experiments to investigate the system's efficiency and latency. The outcomes showed that the proposed model detects equipment correctly and can notify workers of hazardous situations in a timely manner.
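Constructing the -10 dB to +10 dB test conditions mentioned above amounts to scaling a noise clip so that the signal-to-noise power ratio hits a target value before mixing. A minimal sketch, with illustrative sample data and function names of my own choosing:

```python
import math

# Mix an equipment-sound clip with site noise at a target SNR in dB.
# SNR is defined here as 20*log10(rms(signal) / rms(scaled_noise)).

def rms(x):
    """Root-mean-square amplitude of a sample list."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` to achieve `snr_db` against `signal`, then mix.
    Both inputs are assumed to be equal-length sample lists."""
    gain = rms(signal) / (rms(noise) * 10 ** (snr_db / 20.0))
    return [s + gain * n for s, n in zip(signal, noise)]
```

Sweeping `snr_db` over -10, -5, 0, 5, and 10 then yields the graded test set on which a detector's robustness can be measured.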