• Title/Abstract/Keywords: Audio Technology

Search Result 638, Processing Time 0.024 seconds

High Embedding Capacity and Robust Audio Watermarking for Secure Transmission Using Tamper Detection

  • Kaur, Arashdeep;Dutta, Malay Kishore
    • ETRI Journal
    • /
    • v.40 no.1
    • /
    • pp.133-145
    • /
    • 2018
  • Robustness, payload, and imperceptibility of audio watermarking algorithms are contradictory design issues with high-level security of the watermark. In this study, the major issue in achieving high payload along with adequate robustness against challenging signal-processing attacks is addressed. Moreover, a security code has been strategically used for secure transmission of data, providing tamper detection at the receiver end. The high watermark payload in this work has been achieved by using the complementary features of third-level detailed coefficients of discrete wavelet transform where the human auditory system is not sensitive to alterations in the audio signal. To counter the watermark loss under challenging attacks at high payload, Daubechies wavelets that have an orthogonal property and provide smoother frequencies have been used, which can protect the data from loss under signal-processing attacks. Experimental results indicate that the proposed algorithm has demonstrated adequate robustness against signal processing attacks at 4,884.1 bps. Among the evaluators, 87% have rated the proposed algorithm to be remarkable in terms of transparency.
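
The role of the third-level detail coefficients mentioned in this abstract can be illustrated with a minimal sketch. It assumes the Haar wavelet (the simplest member of the Daubechies family) rather than whichever Daubechies order the authors used, and it does not reproduce their embedding or security-code scheme. All function names and the test frame are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approx, detail)."""
    approx = [(signal[i] + signal[i + 1]) * S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level by interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * S)
        out.append((a - d) * S)
    return out


def dwt3(signal):
    """Three-level decomposition: returns (a3, [d1, d2, d3])."""
    approx, details = list(signal), []
    for _ in range(3):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


def idwt3(approx, details):
    """Rebuild the signal from a3 and [d1, d2, d3]."""
    for d in reversed(details):
        approx = haar_inverse_step(approx, d)
    return approx


# Hypothetical 64-sample frame (length must be a multiple of 8).
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
a3, (d1, d2, d3) = dwt3(frame)
restored = idwt3(a3, [d1, d2, d3])

# Orthogonality means the transform preserves signal energy.
energy_in = sum(x * x for x in frame)
energy_out = sum(x * x for x in a3) + sum(x * x for c in (d1, d2, d3) for x in c)
```

Here `d3` holds one coefficient per eight input samples. A watermarking scheme of this kind would modify such coefficients and then call the inverse transform; how the bits are actually embedded is not stated in the abstract.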

CoNSIST : Consist of New methodologies on AASIST, leveraging Squeeze-and-Excitation, Positional Encoding, and Re-formulated HS-GAL

  • Jae-Hoon Ha;Joo-Won Mun;Sang-Yup Lee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2024.05a
    • /
    • pp.692-695
    • /
    • 2024
  • With the recent advancements in artificial intelligence (AI), the performance of deep learning-based audio deepfake technology has significantly improved. This technology has been exploited for criminal activities, leading to various cases of victimization. To prevent such illicit outcomes, this paper proposes a deep learning-based audio deepfake detection model. In this study, we propose CoNSIST, an improved audio deepfake detection model, which incorporates three additional components into the graph-based end-to-end model AASIST: (i) Squeeze and Excitation, (ii) Positional Encoding, and (iii) Reformulated HS-GAL. This incorporation is expected to enable more effective feature extraction, elimination of unnecessary operations, and consideration of more diverse information, thereby improving the performance of the original AASIST. The results of multiple experiments indicate that CoNSIST improves audio deepfake detection performance compared to existing models.
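
As a rough illustration of the first of the three components, the following is a minimal sketch of a generic Squeeze-and-Excitation block in plain Python. It is not the CoNSIST implementation: the weights, the channel layout, and the function name are hypothetical, and biases are omitted for brevity.

```python
import math


def squeeze_excite(channels, w_reduce, w_expand):
    """Generic Squeeze-and-Excitation over a list of channels.

    channels: list of C lists (one feature sequence per channel)
    w_reduce: R x C weight matrix (bottleneck layer, followed by ReLU)
    w_expand: C x R weight matrix (followed by sigmoid)
    Returns (rescaled channels, per-channel gates).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in channels]
    # Excitation: bottleneck with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: each channel is multiplied by its gate in (0, 1).
    scaled = [[g * v for v in ch] for g, ch in zip(gates, channels)]
    return scaled, gates


# Hypothetical toy input: 4 channels, 2 values each, reduction ratio 2.
channels = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0], [4.0, 0.0]]
w_reduce = [[0.5, 0.0, -0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]
w_expand = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
scaled, gates = squeeze_excite(channels, w_reduce, w_expand)
```

The gates let the network emphasize or suppress whole channels based on their global statistics. Where such a block sits inside AASIST is not described in the abstract.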

Audio Marker Detection of the implementation for Effective Augmented Reality (효과적인 증강현실 구현을 위한 오디오 마커 검출)

  • Jeon, Soo-Jin;Kim, Young-Seop
    • Journal of the Semiconductor & Display Technology
    • /
    • v.10 no.2
    • /
    • pp.121-124
    • /
    • 2011
  • Augmented Reality integrates virtual objects into the real world, extending human perception of the real world. Augmented Reality technology combines real and virtual objects in a real environment, runs interactively in real time, and is regarded as an emerging technology that will play a large part in the future of information technology. The benefits for various businesses are therefore estimated to be very high. In this paper, we combine ARToolkit with OpenAL to provide audio to users. The proposed methodology will contribute to a more immersive realization of conventional Augmented Reality systems.

DCT and DWT Based Robust Audio Watermarking Scheme for Copyright Protection

  • Deb, Kaushik;Rahman, Md. Ashikur;Sultana, Kazi Zakia;Sarker, Md. Iqbal Hasan;Chong, Ui-Pil
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.15 no.1
    • /
    • pp.1-8
    • /
    • 2014
  • Digital watermarking techniques are attracting attention as a solution for protecting the copyright of multimedia data. This paper proposes a new audio watermarking method based on Discrete Cosine Transformation (DCT) and Discrete Wavelet Transformation (DWT) for copyright protection. In the proposed method, the original audio is transformed into the DCT domain and divided into two parts. A synchronization code is applied to the signal in the first part, and a two-level DWT is applied to the signal in the second part. The absolute values of the DWT coefficients are divided into an arbitrary number of segments, and the energy of each segment and its middle peak are calculated. Watermarks are then embedded into each middle peak and are extracted by performing the inverse of the embedding process. Experimental results show that the hidden watermark data is robust to re-sampling, low-pass filtering, re-quantization, MP3 compression, cropping, echo addition, delay, pitch shifting, and amplitude change. Performance analysis of the proposed scheme shows low error probability rates.
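
The segmentation step described in this abstract can be sketched as follows. The abstract does not define "middle peak", so this sketch assumes it means the largest-magnitude coefficient within each segment; that reading, the function name, and the sample coefficients are all hypothetical.

```python
def segment_energies(coeffs, num_segments):
    """Split |coeffs| into equal segments; return (energy, peak_index) per segment.

    peak_index is the position in the full coefficient list of the
    largest-magnitude value in that segment (an assumed reading of
    the abstract's "middle peak").
    """
    mags = [abs(c) for c in coeffs]
    seg_len = len(mags) // num_segments
    results = []
    for k in range(num_segments):
        start = k * seg_len
        seg = mags[start:start + seg_len]
        energy = sum(m * m for m in seg)
        peak_offset = max(range(len(seg)), key=seg.__getitem__)
        results.append((energy, start + peak_offset))
    return results


# Hypothetical DWT coefficients, split into two segments of four.
coeffs = [0.1, -0.9, 0.2, 0.05, 0.3, 0.4, -1.2, 0.0]
result = segment_energies(coeffs, 2)
```

An embedding scheme would then modify the coefficient at each returned peak index. The actual modification rule is not given in the abstract.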

A Study on the Customer Satisfaction for Smart Audio's Concept Features through the Kano Model (카노모델(Kano Model)을 이용한 스마트 오디오 컨셉 기능의 고객만족에 관한 연구)

  • Shin, HoonChul;Kim, Jonghak;Park, Young-Taek
    • Journal of Korean Society for Quality Management
    • /
    • v.44 no.4
    • /
    • pp.951-963
    • /
    • 2016
  • Purpose: This study was conducted to analyze potential customers' satisfaction with concept features for smart audio and to apply the results to the development of customer-oriented products. Methods: 16 different features were derived from market research and interviews with professionals. The most satisfactory features were selected using the Kano model, the relative importance given by the Customer Satisfaction Coefficient, and respondents' preferences from 339 valid survey answers. Results: 15 of the 16 features were categorized as attractive attributes. The 'User Recognizing' and 'Strengthen Linking' groups, which include auto connection with a smartphone music player, synchronization of TV and audio, and situation-aware volume control, were shown to provide higher satisfaction to potential customers. On the other hand, the 'Integrating Function' group, which includes aromatherapy and auto lighting reaction, contained the least preferred features. Conclusion: This study identified which features could lead to customer satisfaction. Nevertheless, extensive analyses in different countries and diverse cultures are still required to target the global market. Audio product planners and R&D professionals are expected to learn useful information from such studies.

Design on MPEG2 AAC Decoder

  • NOH, Jin Soo;Kang, Dongshik;RHEE, Kang Hyeon
    • Proceedings of the IEEK Conference
    • /
    • 2002.07c
    • /
    • pp.1567-1570
    • /
    • 2002
  • This paper deals with an FPGA (Field Programmable Gate Array) implementation of an AAC (Advanced Audio Coding) decoder. In modern computing, high-quality data is required in multimedia systems such as CD, DAT (Digital Audio Tape), and modem. Data compression technology for data transmission is therefore a necessity. MPEG (Moving Picture Experts Group) is a standard for such technology. MPEG-2 AAC is an advanced coding scheme for high-quality audio coding. The MPEG-2 AAC audio standard achieves ITU-R 'indistinguishable' quality at data rates of 320 kbit/s for five full-bandwidth channel audio signals. Its compression ratio is around a factor of 1.4 better than that of MPEG Layer-III; it reaches the same quality at 70% of the bitrate. In this paper, real-time MPEG-2 AAC decoding is implemented on an FPGA chip. The designed architecture is composed of a general DSP (Digital Signal Processor), and the designed processor is coded in VHDL. Verification is performed with a simulator programmed in C and an ECAD tool.
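
The bitrate figures quoted in this abstract are consistent with each other, as a short worked calculation shows. The variable names are hypothetical, and the Layer-III equivalent rate is only what the stated factor of 1.4 implies, not a figure from the paper.

```python
total_kbps = 320.0      # stated rate for five full-bandwidth channels
channels = 5
compression_gain = 1.4  # stated advantage over MPEG Layer-III

# Average rate available to each channel.
per_channel_kbps = total_kbps / channels

# A 1.4x better compression ratio means the same quality at 1/1.4 of the rate.
bitrate_fraction = 1.0 / compression_gain

# Rate Layer-III would need for the same quality, if the factor holds exactly.
layer3_equivalent_kbps = total_kbps * compression_gain

print(f"per channel: {per_channel_kbps:.0f} kbit/s")
print(f"fraction of Layer-III rate: {bitrate_fraction:.1%}")
print(f"implied Layer-III rate: {layer3_equivalent_kbps:.0f} kbit/s")
```

The fraction 1/1.4 is roughly 71%, which matches the abstract's rounded "70% of the bitrate".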

Acoustic Monitoring and Localization for Social Care

  • Goetze, Stefan;Schroder, Jens;Gerlach, Stephan;Hollosi, Danilo;Appell, Jens-E.;Wallhoff, Frank
    • Journal of Computing Science and Engineering
    • /
    • v.6 no.1
    • /
    • pp.40-50
    • /
    • 2012
  • The increase in the number of older people due to demographic change poses great challenges to social healthcare systems in both Western and Eastern countries. Support for older people by formal care givers requires enormous time and personnel effort. Therefore, one of the most important goals is to increase the efficiency and effectiveness of today's care. This can be achieved through assistive technologies, which can increase the safety of patients or reduce the time needed for tasks that do not involve direct interaction between the care giver and the patient. Motivated by this goal, this contribution focuses on applications of acoustic technologies to support users and care givers in ambient assisted living (AAL) scenarios. Acoustic sensors are small and unobtrusive and can easily be added to existing care or living environments. The information gathered by the acoustic sensors can be analyzed to calculate the position of the user by localization, and the context by detection and classification of acoustic events in the captured signal. In this way, possibly dangerous situations such as falls, screams, or an increased number of coughs can be detected, and appropriate actions can be initiated by an intelligent autonomous system for the acoustic monitoring of older persons. The proposed system is able to reduce the false alarm rate compared to existing and commercially available approaches that rely essentially on the acoustic level alone. This is because it explicitly distinguishes between the various acoustic events and provides information on the type of emergency that has taken place. Furthermore, the system can determine the position of the acoustic event as contextual information using only the acoustic signal. The position of the user is thus known even if she or he does not wear a localization device such as a radio-frequency identification (RFID) tag.

Audio Listening Enhancement in Adverse Environment based on Loudness Restoration (라우드니스 복원에 기반한 잡음 환경에서의 오디오 청취 향상)

  • Pak, Junhyeong;Shin, Jong Won
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.12
    • /
    • pp.210-216
    • /
    • 2013
  • It is hard to listen to music clearly in the presence of background noise. In this paper, a method that automatically modifies the audio signal to enhance the listening experience in adverse environments is proposed. Specifically, the method amplifies the audio signal so that its perceived loudness in each band becomes similar to that of the noiseless signal. The loudness perception model proposed by Moore et al. is utilized. Extending previous work applied to speech reinforcement, the full-band signal sampled at 48 kHz is manipulated based on the loudness restoration principle. Moreover, based on the observation that audio clarity is compromised even with a loudness-restored signal, a modification that intentionally boosts high-frequency loudness more than that of the lower bands is also proposed. Experimental results showed that the proposed algorithm can enhance the audio listening experience in adverse environments.

A Low Power Multi-Function Digital Audio SoC

  • Lim, Chae-Duck;Lee, Kyo-Sik
    • Proceedings of the IEEK Conference
    • /
    • 2004.06b
    • /
    • pp.399-402
    • /
    • 2004
  • This paper presents a system-on-chip prototype implementing full integration for a portable digital audio system. The chip is composed of an audio processor block that implements audio decoding and voice compression or decompression software, a system control block including an 8-bit MCU core and Memory Management Unit (MMU), a low-power 16-bit ΣΔ CODEC, two DC-to-DC converters, and a flash memory controller. To support audio algorithms other than the fixed codes of the Mask ROM type, a novel 16-bit fixed-point DSP core with a program-download architecture is proposed. Further, an efficient power management technique, task-based clock management, is implemented to reduce power consumption for portable applications. The proposed chip has been fabricated in a 4-metal 0.25 µm CMOS technology; the chip area is about 7.1 mm × 7.1 mm, with 100 mW power dissipation at a 2.5 V power supply.

Dimension-Reduced Audio Spectrum Projection Features for Classifying Video Sound Clips

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.3E
    • /
    • pp.89-94
    • /
    • 2006
  • For audio indexing and targeted search of specific audio or corresponding visual contents, the MPEG-7 standard has adopted a sound classification framework in which dimension-reduced Audio Spectrum Projection (ASP) features are used to train continuous hidden Markov models (HMMs) for classification of various sounds. MPEG-7 employs Principal Component Analysis (PCA) or Independent Component Analysis (ICA) for the dimensional reduction. Other well-established techniques include Non-negative Matrix Factorization (NMF), Linear Discriminant Analysis (LDA), and Discrete Cosine Transformation (DCT). In this paper, we compare the performance of different dimensional reduction methods with Gaussian mixture models (GMMs) and HMMs in classifying video sound clips.
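
Of the reduction methods listed in this abstract, DCT is the simplest to sketch because it needs no training data. The example below applies an orthonormal DCT-II to a single spectrum frame and keeps the first few coefficients. It is an illustration only, not the MPEG-7 ASP extraction procedure; the frame values and function names are hypothetical.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def reduce_dim(spectrum, keep):
    """Keep only the first `keep` DCT coefficients as the feature vector."""
    return dct2(spectrum)[:keep]


# Hypothetical 8-band spectrum frame reduced to 3 dimensions.
spectrum = [3.0, 4.0, 6.0, 7.0, 7.0, 5.0, 3.0, 2.0]
full = dct2(spectrum)
features = reduce_dim(spectrum, 3)
```

Because the transform is orthonormal, the energy discarded by truncation equals the energy in the dropped coefficients. For smooth spectra most of the energy sits in the first few. PCA, ICA, NMF, and LDA instead learn their basis from training data.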