• Title/Summary/Keyword: Audio signal processing

Design of a New Audio Watermarking System Based on Human Auditory System (청각시스템을 기반으로 한 새로운 오디오 워터마킹 시스템 설계)

  • Shin, Dong-Hwan;Shin Seung-Won;Kim, Jong-Weon;Choi, Jong-Uk;Kim, Duck-Young;Kim, Sung-Hwan
    • The Transactions of the Korean Institute of Electrical Engineers D / v.51 no.7 / pp.308-316 / 2002
  • In this paper, we propose a robust digital copyright-protection technique based on the human auditory system. First, we propose a watermarking technique that withstands various attacks such as time scaling, pitch shifting, added noise, and lossy compression formats such as MP3, AAC, and WMA. Second, we implement an audio PD (portable device) for copyright protection using the proposed method. The proposed watermarking technique is developed using digital filtering. Because the digital filters are designed according to the critical bands of the HAS (human auditory system), they embed the watermark with almost no effect on audio quality. Before digital filtering, a wavelet transform decomposes the input audio signal into several signals composed of specific frequency ranges. We then embed the watermark in the decomposed signal (0 kHz~11 kHz) with the designed band-stop digital filter. The watermark detection algorithm is implemented on the audio PD. The proposed watermarking technology embeds 2 bits of information per 15 seconds. If the PD detects the watermark '11', which marks an illegal song, it displays an "Illegal Song" message on the LCD, skips the song, and plays the next one. The detection algorithm implemented in the PD requires 19 MHz of computational power, 7.9 kBytes of ROM, and 10 kBytes of RAM. The suggested technique satisfies the SDMI (Secure Digital Music Initiative) platform-3 requirements on an ARM9E core.
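
The band-stop embedding idea above can be sketched in a few lines. The following Python toy notches a narrow FFT band rather than using the paper's wavelet-plus-filter design, and the band edges, attenuation, and detection threshold are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

def embed_bit(audio, sr, bit, band=(5000.0, 5500.0)):
    """Embed one watermark bit: bit=1 notches a narrow band
    (a crude FFT-domain stand-in for a band-stop filter)."""
    spec = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), 1.0 / sr)
    if bit:
        spec[(freqs >= band[0]) & (freqs <= band[1])] *= 0.01
    return np.fft.irfft(spec, n=len(audio))

def detect_bit(audio, sr, band=(5000.0, 5500.0), ref=(4000.0, 4500.0)):
    """Report bit=1 when the watermark band is much weaker than a
    neighbouring reference band."""
    mag = np.abs(np.fft.rfft(audio))
    freqs = np.fft.rfftfreq(len(audio), 1.0 / sr)
    e_band = mag[(freqs >= band[0]) & (freqs <= band[1])].mean()
    e_ref = mag[(freqs >= ref[0]) & (freqs <= ref[1])].mean()
    return int(e_band < 0.5 * e_ref)

sr = 22050
rng = np.random.default_rng(0)
audio = rng.standard_normal(sr)          # 1 s of noise as a stand-in for music
marked = embed_bit(audio, sr, bit=1)
print(detect_bit(audio, sr), detect_bit(marked, sr))  # 0 1
```

A real system would, as the abstract describes, shape the notch to the HAS critical bands so the removal stays inaudible.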

Energy-Efficient Approximate Speech Signal Processing for Wearable Devices

  • Park, Taejoon;Shin, Kyoosik;Kim, Nam Sung
    • ETRI Journal / v.39 no.2 / pp.145-150 / 2017
  • As wearable devices are powered by batteries, they need to consume as little energy as possible. To address this challenge, in this article, we propose a synergistic technique for energy-efficient approximate speech signal processing (ASSP) for wearable devices. More specifically, to enable an efficient trade-off between energy consumption and sound quality, we synergistically integrate an approximate multiplier and a successive approximation register analog-to-digital converter using our enhanced conversion algorithm. The proposed ASSP technique provides ~40% lower energy consumption with ~5% higher sound quality than a traditional technique that optimizes only the bit width of SSP.
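
The approximate-multiplier half of this trade-off can be modeled in software. The sketch below uses operand truncation, a standard way such hardware saves energy; the `drop_bits` parameter and the truncation scheme are assumptions for illustration, not the paper's actual design:

```python
def approx_mul(a, b, drop_bits=4):
    """Approximate integer multiply: zero the low drop_bits of each
    operand before multiplying -- a software model of a
    truncation-based approximate multiplier."""
    mask = ~((1 << drop_bits) - 1)
    return (a & mask) * (b & mask)

exact = 12345 * 6789
approx = approx_mul(12345, 6789, drop_bits=4)
rel_err = abs(exact - approx) / exact
print(f"exact={exact} approx={approx} rel_err={rel_err:.4%}")
```

Dropping low bits shrinks the partial-product array in hardware, which is where the energy saving comes from, at the cost of a small relative error.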

A Compensation of Linear Distortion for Loudspeaker Using the Adaptive Digital Filter (적응 디지탈 필터를 이용한 확성용 스피커의 선형 왜곡 보상)

  • 전희영;차일환
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1995.06a / pp.165-170 / 1995
  • In this paper, we apply adaptive digital signal processing to compensate for the linear distortion of a loudspeaker and implement real-time hardware for that purpose. The real-time system uses the DSP56001, a general-purpose signal processor, as the host processor and the DSP56200, a cascadable adaptive FIR filter peripheral chip, as the adaptive digital filter. The system has 1,000 taps at a 44.1 kHz sampling rate. After inverse modeling of the loudspeaker under compensation, the system reduces the loudspeaker's linear distortion by pre-processing the input audio signal before it reaches the loudspeaker. The experiment shows satisfactory results: after adaptation with white noise as the input signal for 60 seconds, flat amplitude and linear-phase frequency characteristics are found over a wide frequency range of 100 Hz to 20 kHz.
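
The inverse-modeling step can be sketched with a basic LMS adaptive FIR filter, the algorithm family the DSP56200 implements. This is a minimal software analogue: the two-tap loudspeaker model, tap count, and step size are invented for illustration, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(20000)        # white-noise probe signal
h = np.array([1.0, 0.5])              # toy minimum-phase loudspeaker model
s = np.convolve(x, h)[:len(x)]        # signal the loudspeaker would emit

taps, mu = 16, 0.01
w = np.zeros(taps)
for n in range(taps - 1, len(x)):
    u = s[n - taps + 1:n + 1][::-1]   # newest sample first
    e = x[n] - w @ u                  # error against the original signal
    w += mu * e * u                   # LMS weight update

# 1/(1 + 0.5 z^-1) expands to 1, -0.5, 0.25, ... -- the learned inverse
print(np.round(w[:3], 2))
```

Once the inverse is learned, pre-filtering the program material with `w` before the loudspeaker cancels the modeled linear distortion, which is exactly the pre-processing arrangement the abstract describes.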

A comprehensive design cycle for car engine sound: from signal processing to software component to be integrated in the audio system of the vehicle

  • Orange, Francois;Boussard, Patrick
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2012.04a / pp.208-209 / 2012
  • This paper describes a comprehensive process and a range of design tools and components for improving the perception of engine sound in mass-production vehicles by generating finely tuned engine harmonics.
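
Engine-harmonic generation of the kind described can be sketched as additive synthesis of engine orders, i.e. harmonics of the rotation frequency (rpm/60 Hz); the orders and gains below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def engine_orders(rpm, orders, gains, sr=22050, dur=0.5):
    """Additive synthesis of engine orders: each order is a harmonic
    of the rotation frequency rpm/60 Hz."""
    t = np.arange(int(sr * dur)) / sr
    f0 = rpm / 60.0
    return sum(g * np.sin(2 * np.pi * o * f0 * t) for o, g in zip(orders, gains))

snd = engine_orders(3000, orders=[2, 4, 6], gains=[1.0, 0.5, 0.25])
mag = np.abs(np.fft.rfft(snd))
freqs = np.fft.rfftfreq(len(snd), 1.0 / 22050)
print(freqs[np.argmax(mag)])  # strongest component: order 2, near 100 Hz
```

In a production system the order gains would be tuned per engine speed and load and rendered through the vehicle's audio system, as the paper's design cycle describes.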

Automatic melody extraction algorithm using a convolutional neural network

  • Lee, Jongseol;Jang, Dalwon;Yoon, Kyoungro
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.12 / pp.6038-6053 / 2017
  • In this study, we propose an automatic melody extraction algorithm using deep learning. In this algorithm, feature images, generated using the energy of frequency bands, are extracted from polyphonic audio files, and a deep learning technique, a convolutional neural network (CNN), is applied to the feature images. In the training data, a short frame of polyphonic music is labeled with a musical note, and a CNN-based classifier is trained to determine the pitch value of a short frame of the audio signal. We aim to build a novel melody extraction structure, so the proposed algorithm is kept simple: instead of using various signal processing techniques for melody extraction, we use only a CNN to find the melody in polyphonic audio. Despite its simple structure, promising results are obtained in the experiments. Compared with state-of-the-art algorithms, the proposed algorithm does not give the best result, but comparable results are obtained, and we believe they could be improved with appropriate training data. In this paper, melody extraction and the proposed algorithm are introduced first, and the proposed algorithm is then explained in detail. Finally, we present our experiment, and a comparison of results follows.
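
The feature-image step described above (frame-wise energy of frequency bands, stacked into an image for the CNN) can be sketched as follows; the band layout, frame size, and hop are assumptions, not the paper's exact configuration:

```python
import numpy as np

def feature_image(audio, sr, frame=1024, hop=512, n_bands=32):
    """Per-frame energies of linearly spaced frequency bands, stacked
    into a bands-by-frames image a CNN could classify."""
    rows = []
    win = np.hanning(frame)
    for start in range(0, len(audio) - frame, hop):
        power = np.abs(np.fft.rfft(audio[start:start + frame] * win)) ** 2
        rows.append([band.sum() for band in np.array_split(power, n_bands)])
    return np.log1p(np.array(rows).T)     # log compression, bands x frames

sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)      # A4 test tone
img = feature_image(tone, sr)
print(img.shape, np.argmax(img.mean(axis=1)))   # the 440 Hz tone lands in band 1
```

Each column of the image corresponds to one short frame, the unit the paper labels with a musical note for classifier training.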

Digital Watermarking using DCT and Color Coordinate of Human Vision (DCT 변환과 인간시각 칼라좌표계를 이용한 디지털 워터마킹)

  • 박성훈;김정엽;현기호
    • Proceedings of the IEEK Conference / 2002.06d / pp.243-246 / 2002
  • The proliferation of digitized media (audio, image, and video) is creating a pressing need for copyright enforcement schemes that protect copyright ownership. We argue that a watermark must be placed in perceptually significant components of a signal if it is to be robust to signal distortions and malicious attacks. In this paper, an RGB image is transformed into the LUV coordinate system, which reflects the characteristics of human vision, and the UV components are then transformed with an NxN block DCT. We propose a technique for embedding a visually recognizable watermark into the middle-frequency domain of the image.
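
The block-DCT embedding described can be sketched on a single 8x8 block. The chosen mid-frequency coefficient and strength are illustrative assumptions, and a plain value block stands in for the paper's UV channels of an LUV image:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix."""
    k, m = np.meshgrid(np.arange(n), np.arange(n))  # k: sample index, m: frequency
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k + 1) * m / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def embed_bit(block, bit, coeff=(3, 4), strength=12.0):
    """Force one mid-frequency DCT coefficient positive (bit=1)
    or negative (bit=0)."""
    C = dct_matrix()
    D = C @ block @ C.T
    D[coeff] = strength if bit else -strength
    return C.T @ D @ C                    # inverse DCT back to pixel domain

def extract_bit(block, coeff=(3, 4)):
    C = dct_matrix()
    return int((C @ block @ C.T)[coeff] > 0)

rng = np.random.default_rng(2)
block = rng.uniform(0, 255, (8, 8))
print(extract_bit(embed_bit(block, 1)), extract_bit(embed_bit(block, 0)))  # 1 0
```

Mid-frequency coefficients are the usual target because, as the abstract argues, they are perceptually significant enough to survive distortion yet low-energy enough to modify invisibly.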

Audio-visual Spatial Coherence Judgments in the Peripheral Visual Fields

  • Lee, Chai-Bong;Kang, Dae-Gee
    • Journal of the Institute of Convergence Signal Processing / v.16 no.2 / pp.35-39 / 2015
  • Auditory and visual stimuli presented in the peripheral visual field were perceived as spatially coincident when the auditory stimulus was presented five to seven degrees outward from the direction of the visual stimulus. Furthermore, judgments of the perceived distance between auditory and visual stimuli presented in the periphery did not increase when the auditory stimulus was presented on the peripheral side of the visual stimulus. As to the origin of this phenomenon, there seem to be two possibilities. One is that the participants could not perceptually distinguish the distances on the peripheral side because of the limited accuracy of spatial perception. The other is that the participants could distinguish the distances but could not report them because the experimental setup provided too few auditory stimulus positions. To confirm which of these two explanations is valid, we conducted an experiment similar to that of our previous study using a sufficient number of loudspeakers for the presentation of auditory stimuli. The results revealed that judgments of perceived distance increased on the peripheral side. This indicates that we can discriminate between auditory and visual stimulus positions on the peripheral side.

Emotion Recognition of Low Resource (Sindhi) Language Using Machine Learning

  • Ahmed, Tanveer;Memon, Sajjad Ali;Hussain, Saqib;Tanwani, Amer;Sadat, Ahmed
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.369-376 / 2021
  • One of the most active areas of research in affective computing and signal processing is emotion recognition. This paper proposes emotion recognition for the low-resource Sindhi language. This work's uniqueness is that it examines the emotions of a language for which there is currently no publicly accessible dataset. The proposed effort provides a dataset named MAVDESS (Mehran Audio-Visual Database of Emotional Speech in Sindhi) for the academic community of the Sindhi language, which is mainly spoken in Pakistan; apart from a few exceptions, no generic data for such languages is available for machine learning. Furthermore, the various emotions in MAVDESS are analyzed and annotated using features such as pitch, volume, and base, with toolkits such as OpenSmile and Scikit-Learn and classification schemes such as LR, SVC, DT, and KNN, implemented in Python for training. The dataset can be accessed via https://doi.org/10.5281/zenodo.5213073.
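
A minimal version of such a pipeline, acoustic features followed by one of the listed classifiers (KNN), might look like the following. The two toy features and the synthetic "calm"/"angry" signals are invented for illustration and are far simpler than the paper's openSMILE feature sets:

```python
import numpy as np

def features(sig):
    """Two toy acoustic features: mean energy (volume) and
    zero-crossing rate (a crude pitch proxy)."""
    energy = float(np.mean(sig ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.sign(sig)))) / 2.0)
    return np.array([energy, zcr])

def knn_predict(X, y, q, k=3):
    """Majority vote among the k nearest training points."""
    idx = np.argsort(np.linalg.norm(X - q, axis=1))[:k]
    labels, counts = np.unique(y[idx], return_counts=True)
    return labels[np.argmax(counts)]

sr = 8000
t = np.arange(sr) / sr
def tone(freq, amp):
    return amp * np.sin(2 * np.pi * freq * t)

# Synthetic 'calm' (quiet, low pitch) vs 'angry' (loud, high pitch) samples
train = [(tone(180 + d, 0.1), "calm") for d in (0, 10, 20)] + \
        [(tone(900 + d, 0.8), "angry") for d in (0, 10, 20)]
X = np.array([features(s) for s, _ in train])
y = np.array([label for _, label in train])

print(knn_predict(X, y, features(tone(950, 0.7))))  # angry
```

Real emotional speech needs richer features and cross-validated evaluation, but the structure, feature extraction feeding a trained classifier, is the same as in the paper.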

Design of an MPEG-2 AAC Decoder

  • NOH, Jin Soo;Kang, Dongshik;RHEE, Kang Hyeon
    • Proceedings of the IEEK Conference / 2002.07c / pp.1567-1570 / 2002
  • This paper deals with the FPGA (Field Programmable Gate Array) implementation of an AAC (Advanced Audio Coding) decoder. In modern computing, high-quality data is required in multimedia systems such as CD, DAT (Digital Audio Tape), and modems, so data compression technology for data transmission is now a necessity. MPEG (Moving Picture Experts Group) standards cover this technology. MPEG-2 AAC is an advanced coding scheme for high-quality audio coding. The MPEG-2 AAC audio standard achieves ITU-R 'indistinguishable' quality at data rates of 320 kbit/s for five full-bandwidth channel audio signals. Its compression ratio is around a factor of 1.4 better than MPEG Layer-III; it achieves the same quality at 70% of the bit rate. In this paper, for real-time MPEG-2 AAC decoding, the decoder is implemented on an FPGA chip. The designed architecture is structured as a general-purpose DSP (Digital Signal Processor), and the processor is coded in VHDL. Verification is performed with a C-language simulator and an ECAD tool.

Real-Time Implementation of MPEG-1 Audio decoder on ARM RISC (ARM RISC 상에서의 MPEG-1 Audio decoder의 실시간 구현)

  • 김선태
    • Proceedings of the IEEK Conference / 2000.11d / pp.119-122 / 2000
  • Recently, many complex DSP (Digital Signal Processing) algorithms have been realized on RISC CPUs thanks to good compilers, low power consumption, and large memory space. However, real-time implementation of multiple DSP algorithms on a RISC requires minimal, efficient memory usage and low CPU occupancy. In this paper, the original floating-point code of an MPEG-1 audio decoder is converted to fixed-point code and then optimized into efficient assembly code in the time-consuming functions, in accordance with RISC features. Compared with the floating-point and fixed-point versions, speed enhancements of about 30 and 3 times, respectively, are achieved, and about 3 to 4 times the memory space is saved.
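
The floating-point-to-fixed-point conversion at the heart of this port can be sketched with Q15 arithmetic, a common 16-bit format for audio decoders on integer-only CPUs; the paper does not state its exact format, so Q15 here is an assumption:

```python
def to_q15(x):
    """Float in [-1, 1) -> Q15 fixed point (signed 16-bit integer)."""
    return max(-32768, min(32767, int(round(x * 32768))))

def from_q15(q):
    """Q15 fixed point -> float."""
    return q / 32768.0

def q15_mul(a, b):
    """Fixed-point multiply: take the 32-bit product, then shift
    back down to Q15 (this shift is where precision is lost)."""
    return (a * b) >> 15

a, b = to_q15(0.5), to_q15(0.25)
print(from_q15(q15_mul(a, b)))  # 0.125
```

On an ARM-class RISC these operations compile to plain integer multiplies and shifts, which is why the fixed-point decoder runs roughly an order of magnitude faster than software-emulated floating point.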
